set of dynamic nodes or agents which coordinate to perform the mission objectives. These
coordinated clusters could be composed of an array of satellites constructing a large aperture
radar or a swarm of UAVs used to suppress enemy air defenses. These mission objectives
are to be achieved in the presence of large uncertainties due largely to a hostile environment.
Within this context, nodes may fail at various levels, measurements may be highly corrupted,
and communication links are further challenged due to power constraints and large spatial dispersion, producing
tradeoffs  between  uncertain  information,  latency  and  bandwidth  constraints.  A  decision 
and  allocation  process  appears  computationally  intractable,  especially  if  mechanized  using  a 
centralized  architecture. 

Over the past three years, important insights have been gained and significant progress
has been made on certain basic issues associated with the development of decentralized
fault detection and identification, and control algorithms for non-classical information
patterns. These efforts allowed an appreciation of the complexity of the decentralized problem,
but more importantly they showed the directions that should be taken to make significant
progress in the fundamental issues of distributed estimation, analytical redundancy
management, and control.

In particular, control issues involving data transmission through noisy channels were
explored. Furthermore, the problem of detecting faults in local agents and the development
of a decentralized methodology for distributed redundancy management were addressed, and
some resolution to these problems was obtained. These results have given new direction to the
development of a theory for the control of dynamic networks.
1  Introduction

One of the most important findings of the Air Force Scientific Advisory Board Summer
Study on UAVs, from the mission system viewpoint, is that in most operational tasks,
UAVs should frequently be employed in coordinated clusters rather than as independent
platforms. This notion should also be applied to the use of clusters of satellites configured
to  produce,  for  example,  a  large  aperture  radar.  These  clusters  of  cooperating  agents  may 
be  characterized  by  a  spatially  distributed  set  of  dynamic  nodes  where  individual  agents 
have  access  to  regional  information  and  can  share  data  through  a  communications  network. 
Within  this  context,  nodes  may  fail  at  various  levels,  measurements  may  be  highly  corrupted 
and  communication  channels  are  challenged  due  to  power  constraints,  noisy  information, 
latency  and  bandwidth  constraints. 

The system requirements are induced by the mission objectives, the mission environment,
and the system capability. The system capability depends upon the type and number of assets.
These assets are fused together by an information system which integrates all available
information over all assets or nodes in a wireless communication data network. Upon this
data network is imposed a decision system which directs assets such that mission objectives
are met. The development of such a system is a great intellectual challenge. Current
approaches impose heuristic management architectures on a hierarchy of system functions
and use nonparametric schemes to produce the decision processes. From these approaches,
little can be understood about the value of information and the decision processes that use
this information. Since transmission of information is necessarily limited, it is important to
communicate only the information that is most valuable for mission success within the data
network. Furthermore, decision rules are to be constructed that best utilize information for
the control and guidance of a particular node, or to enhance the awareness of the other nodes
with minimal transmissions.

Since the control of dynamic networks is essentially in its infancy, creating general procedures
that can be efficiently implemented is a long-term goal of any well-conceived research
program. However, near-term goals could be posed which represent important elements of
the  more  complete  problem.  To  this  end,  over  the  last  three  years,  significant  progress  has 
been  made  on  the  development  of  fault  detection  filters,  decentralized  fault  detection  and 
identification,  and  control  of  systems  with  nonclassical  information  patterns.  In  Section  2, 
this  work  is  briefly  described  with  supporting  documentation  in  the  appendices.  In  Section 
2.1,  the  effect  of  noisy  information  and  the  type  of  information  that  must  be  transmitted  to 
ensure  stable  and  well  performing  decentralized  control  is  illuminated.  In  Section  2.2,  the 
structure  of  the  decentralized  detection  filter  required  for  analytic  redundancy  management 
of a cluster of agents has indicated the direction for a more complete theory on the decomposition
of the global system into local systems. For example, this decomposition should
achieve  the  minimal  distribution  of  information  in  the  dynamic  network.  In  a  distributed 
sensing  and  actuation  architecture,  individual  agents  have  access  to  regional  information 
which, perhaps, can be shared under a data communication network. The results of Section 2
are important because they show the directions for developing a systematic methodology
for coordinating a distributed set of local systems for global operation.
2  Progress  Over  the  Grant  Period

A  consequence  of  cooperative  missions  is  that  the  agents  may  require  robust,  high-performance 
data  networks  for  information  exchange.  This  need  drives  a  new  research  direction  in  the 
theory  of  the  control  of  dynamic  networks.  The  essence  of  determining  the  architecture  for 
a  cooperative  system  of  agents  under  large  uncertainties  is  to  ensure  valued  information  is 
distributed in a timely way. In Section 2.1, control issues involving data transmission
through  noisy  channels  are  explored.  It  is  shown  that  if  the  noise  is  not  excessive,  then 
linear  controllers  with  modified  gains  are  nearly  optimal.  Furthermore,  the  form  of  the  data 
to  be  transmitted  has  an  enormous  impact  on  the  system  stability  as  well  as  performance. 
In  Section  2.2  the  problem  of  detecting  faults  in  the  local  agents  and  the  development  of 
a  decentralized  methodology  for  distributed  redundancy  management  are  described.  Two 
important  new  results  are  obtained.  First,  we  develop  a  new  robust  detection  filter  based 
on  a  disturbance  attenuation  methodology,  which  allows  the  detection  and  identification  of 
multiple faults. Second, a decentralized detection filter is developed.

2.1  Decentralized  Control  with  Noisy  Transmission  of  Information 

In [1, Appendix A], a simple example of a decentralized control problem with noisy information
transmission was investigated. This example is a reformulation of the Witsenhausen
counterexample  which  allows  the  first  station  to  send  its  information  to  the  second  station 
through  an  additive  white  Gaussian  noise  channel.  We  show  that  Witsenhausen’s  original 
counterexample  can  be  seen  as  a  limit  case  in  this  new  formulation.  We  believe  that  this 
new  formulation  is  closer  to  many  applications  in  large  scale  systems,  where  different  pieces 
of  information  could  be  transmitted  among  the  stations  through  some  noisy  channels.  We 
should  note  that  as  soon  as  some  kind  of  communication  uncertainty  is  introduced  for  the 
transmission,  the  information  pattern  is  no  longer  classical  and  the  cost  may  no  longer  be 
convex in the strategies. Hence, the optimal strategies, which may not even be unique,
are very difficult to find. Approaches similar to those used so far for the Witsenhausen
problem might be applied to this new formulation as well. For example, asymptotic
approaches using expansions in a small parameter $\epsilon$ can be used.

In  [2,  Appendix  B],  we  considered  the  case  where  the  communication  uncertainty  is  small. 
We  followed  an  asymptotic  approach  where  we  approximated  the  cost  based  on  its  expansion 
in terms of the small transmission noise intensity. We showed how minimizing the approximated
cost can be seen as a singular optimization problem. We then used a variational
approach  in  order  to  find  the  necessary  conditions  for  the  asymptotically  optimal  strategies 
and  showed  that  some  reasonable  linear  strategies  would  actually  satisfy  those  conditions. 
We  also  provided  some  intuitive  explanations  for  the  behavior  of  those  linear  strategies  and 
obtained  their  corresponding  cost.  All  the  derivations  and  results  in  this  paper  show  some  of 
the  difficulties  involved  in  dealing  with  decentralized  systems  as  soon  as  we  deviate  a  little 
bit  from  a  classical,  or  at  least  a  partially  nested,  information  pattern.  On  the  other  hand, 
although  we  have  modeled  the  communication  uncertainty  in  the  simplest  possible  way,  we 
have  tried  to  emphasize  the  role  of  communication  uncertainties  in  generating  information 
patterns that are very difficult to handle. Even though the optimization problem is generally
difficult for this class of systems, in some applications we might be able to exploit the specific
structure of the system in order to obtain some reasonably good sub-optimal strategies,
which would yield an acceptable performance.

Finally,  in  [3,  Appendix  C],  a  more  general  two-station  decentralized  LQG  problem  was 
formulated,  where  the  local  controllers  had  to  be  designed  based  on  some  local  information 
in  order  to  minimize  a  single  common  cost.  This  problem  generally  has  a  non-classical 
information  pattern  and  the  optimal  controls  are  usually  unknown.  One  of  the  first  possible 
sub-optimal  approaches  is  to  decompose  the  problem  into  separate  centralized  problems.  In 
this  paper,  we  investigated  such  an  approach  for  different  communication  scenarios  between 
the  stations,  namely,  when  the  stations  communicate  their  controls,  their  measurements 
or  both,  or  their  estimation  residuals.  We  showed  that  even  though  our  approach  is  quite 
reasonable  for  the  case  where  the  stations  communicate  all  their  measurements,  it  may  fail  to 
stabilize  the  closed-loop  system  as  soon  as  the  compensator  is  unstable.  Then,  we  showed  how 
this  difficulty  can  be  removed  if  the  stations  either  communicate  both  their  measurements 
and their controls or communicate their estimation residuals. All these results show some
of the fundamental differences between the centralized and the decentralized structures.
Moreover, we have tried to elaborate on the role of communication among the stations and
corresponding uncertainties. While many new applications for spatially distributed dynamic
systems are emerging, there are still major difficulties that need to be addressed.

2.2  Detection  Filters  for  Robust  Analytical  Redundancy 

Any system under automatic control demands a high degree of system reliability. Therefore,
the  system  relies  on  the  health  of  the  sensors,  plant,  and  actuators.  If  a  system  fault 
occurs,  the  controller  will  not  work  properly.  If  a  sensor  fails,  the  command  generated  by 
the  controller  will  be  based  on  the  wrong  information.  If  an  actuator  fails,  the  controller’s 
command  will  not  be  executed  properly  in  the  system.  Therefore,  a  health  monitoring  system 
capable  of  detecting  a  fault  as  it  occurs  and  identifying  the  faulty  component  is  required.  The 
most  common  approach  is  hardware  redundancy,  which  is  the  direct  comparison  of  identical 
components.  This  approach  requires  very  little  computation.  However,  hardware  redundancy 
is  expensive  and  limited  by  space  and  weight.  An  alternative  is  analytical  redundancy,  which 
uses  a  modeled  dynamic  relationship  between  system  inputs  and  measured  system  outputs 
to  form  a  residual  process  used  for  detecting  and  identifying  faults.  Nominally,  the  residual  is 
nonzero  only  when  a  fault  has  occurred  and  is  zero  at  other  times.  Therefore,  no  redundant 
components  are  needed.  However,  additional  computation  is  required. 
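
To make the idea of a residual process concrete, the following Python sketch runs a simple observer alongside a simulated plant and forms the residual as the difference between the measured and the predicted output. The scalar plant, the observer gain, the sensor-bias fault, and the function name `simulate_residual` are all assumptions made for illustration; they are not taken from the report or its references, and the observer here is an ordinary Luenberger observer rather than one of the detection filters discussed below.

```python
import numpy as np

def simulate_residual(steps=200, fault_step=100, bias=1.0, seed=0):
    """Return the residual history of an observer monitoring a scalar plant.

    Hypothetical plant: x[k+1] = a*x[k] + b*u[k], y[k] = c*x[k] + noise (+ fault).
    """
    rng = np.random.default_rng(seed)
    a, b, c = 0.9, 1.0, 1.0      # assumed plant model, shared by the observer
    l_gain = 0.5                 # observer gain; a - l_gain*c = 0.4 is stable
    x, x_hat = 1.0, 1.0          # initial state assumed known to the observer
    residuals = np.zeros(steps)
    for k in range(steps):
        u = np.sin(0.1 * k)                           # known input
        fault = bias if k >= fault_step else 0.0      # additive sensor bias
        y = c * x + fault + 0.01 * rng.standard_normal()
        r = y - c * x_hat                             # measured minus predicted output
        residuals[k] = r
        x_hat = a * x_hat + b * u + l_gain * r        # observer update
        x = a * x + b * u                             # true plant update
    return residuals

if __name__ == "__main__":
    r = simulate_residual()
    alarm = int(np.argmax(np.abs(r) > 0.1))           # first threshold crossing
    print("first alarm at step", alarm)
```

In this sketch the residual stays at the noise level until the bias appears and then jumps. Note that the observer partially absorbs a constant sensor bias, so the residual settles well below the bias magnitude after the initial jump; this is one reason a plain observer is only a starting point for fault detection.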

A popular approach to analytical redundancy is the detection filter, which was first introduced
in [4] and refined in [5]. It is also known as the Beard-Jones fault detection filter. A
geometric  interpretation  of  this  filter  is  given  in  [6]  and  a  spectral  theory  and  implementation 
appeared  in  [7].  Design  algorithms  have  been  developed  [8,9]  which  improved  detection  filter 
robustness.  The  idea  of  a  detection  filter  is  to  put  the  reachable  subspace  of  each  fault  into 
invariant  subspaces  which  do  not  overlap  with  each  other.  Then,  when  a  nonzero  residual 
is  detected,  a  fault  can  be  announced  and  identified  by  projecting  the  residual  onto  each  of
the  invariant  subspaces.  Therefore,  multiple  faults  can  be  monitored  in  one  filter. 
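
The identification step described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the fault subspaces are taken to be mutually orthogonal, so orthogonal projectors suffice, whereas a real detection filter generally needs projectors constructed from the filter design itself. The function names `projector` and `identify_fault` and the example subspaces are hypothetical.

```python
import numpy as np

def projector(basis):
    """Orthogonal projector onto the column space of `basis` (n x m array)."""
    q, _ = np.linalg.qr(basis)   # orthonormal basis for the column space
    return q @ q.T

def identify_fault(residual, subspaces, threshold=1e-3):
    """Return the index of the subspace capturing most of the residual.

    Returns None when the residual norm is below `threshold` (no fault announced).
    """
    if np.linalg.norm(residual) < threshold:
        return None
    energies = [np.linalg.norm(projector(s) @ residual) for s in subspaces]
    return int(np.argmax(energies))

if __name__ == "__main__":
    # Two assumed, non-overlapping fault subspaces in a 3-dimensional residual space.
    subspaces = [np.array([[1.0], [0.0], [0.0]]),
                 np.array([[0.0], [1.0], [1.0]])]
    print(identify_fault(np.array([0.0, 2.0, 2.0]), subspaces))
```

Because each fault is confined to its own subspace, a single residual vector is enough to both announce and identify the fault, which is the sense in which one filter monitors multiple faults.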

Another  related  approach,  the  unknown  input  observer  [10],  simplifies  the  detection  filter 
problem  by  dividing  the  faults  into  a  target  fault  and  a  group  of  nuisance  faults  where  the 
nuisance  faults  are  placed  into  one  unobservable  subspace.  Although  only  one  fault  can  be 
detected  in  each  unknown  input  observer,  the  additional  flexibility  in  robust  fault  detection 
filter  design  for  general  time-varying  systems  is  obtained  by  using  this  approximate  fault 
detection  filter. 

Four fault detection and identification algorithms were developed, progressively improving
the relationship between robustness and detection and identification. They are the
game theoretic fault detection filter (an approximate unknown input observer), the optimal
stochastic fault detection filter (an approximate unknown input observer), the residual-sensitive
fault detection filter (an approximate unknown input observer), and the optimal
stochastic multiple-fault detection filter (an approximate Beard-Jones fault detection filter).

2.2.1  A  Game-Theoretic  Fault  Detection  Filter 

In  [11]  we  posed  and  solved  a  disturbance  attenuation  problem  which  closely  approximates 
the  actions  of  a  fault  detection  filter.  The  end  product  is  a  game  theoretic  filter  which 
acts  as  an  approximate  unknown  input  observer.  We  also  showed  that  this  approximation 
can  be  made  more  and  more  exact  until,  in  the  limit,  the  game  theoretic  filter  becomes  an 
unknown  input  observer  exactly.  A  related  result  is  that  a  reduced-order  observer  can  also  be 
obtained  from  the  limiting  case.  The  disturbance  attenuation-based  approach  that  we  have 
introduced  here  leads  to  filters  which  are  more  flexible,  more  robust,  and  more  applicable, 
than  existing  fault  detection  structures.  This  approach  allows  time-varying  systems  to  be 
monitored for the first time. Finally, in the course of our limiting case analysis, we showed
that  singular  optimization  theory  can  be  used  to  analyze  the  asymptotic  properties  of  game 
theoretic  estimators.  It  is  possible  that  the  application  of  singular  optimization  theory  to 
other  disturbance  attenuation  problems  can  lead  to  similar  insights. 

2.2.2  Optimal  Stochastic  Fault  Detection  Filter 

Properties  of  the  optimal  stochastic  fault  detection  filter  for  fault  detection  and  identification 
are  determined  in  [12,  Appendix  C].  The  objective  of  the  filter  is  to  monitor  a  certain  fault 
called  the  target  fault  and  block  other  faults  which  are  called  nuisance  faults.  This  filter  is 
derived  by  keeping  the  ratio  of  the  transmission  from  nuisance  faults  to  the  transmission  from 
the  target  fault  small.  Rather  than  an  arbitrary  function,  the  fault  amplitudes  are  modeled 
as  white  noise  input  processes.  It  is  shown  that  this  filter  approximates  the  properties  of  the 
classical  fault  detection  filter  such  that  in  the  limit,  where  the  ratio  of  the  transmissions  is 
zero,  the  optimal  stochastic  fault  detection  filter  is  equivalent  to  the  unknown  input  observer. 
However,  the  nuisance  fault  directions  and  their  associated  invariant  zero  directions  must  be 
included  in  the  invariant  subspace  generated  by  this  fault  detection  filter.  Fault  detection 
filter  designs  can  be  obtained  for  both  linear  time-invariant  and  time-varying  systems. 
2.2.3  A  Generalized  Least-Squares  Fault  Detection  Filter

The  generalized  least-squares  fault  detection  filter  in  [13,  Appendix  D]  is  derived  from  solving 
a  min-max  problem  which  makes  the  residual  sensitive  to  the  target  fault,  but  not  to  the 
nuisance  fault.  This  is  an  alternate  derivation  of  the  optimal  stochastic  fault  detection 
filter  [12]  of  Section  2.2.2.  In  the  limit,  as  the  nuisance  fault  weighting  goes  to  zero,  this 
filter is equivalent to an unknown input observer which puts the nuisance fault into an
unobservability  subspace.  Furthermore,  there  exists  a  reduced-order  filter  in  the  limit.  Since 
the  target  fault  is  explicit  in  this  derivation,  the  reduced-order  filter  is  found  with  respect 
to  the  target  fault  direction  and  weighting.  This  aspect  is  different  from  that  of  the  game 
theoretic  detection  filter  [11]  where  this  dependence  does  not  exist.  This  filter  also  extends 
the  unknown  input  observer  to  a  time-varying  system. 

2.2.4  Optimal  Stochastic  Multiple-Fault  Detection  Filter 

The  optimal  stochastic  multi-fault  detection  filter  [14,  Appendix  E]  is  a  generalization  of  the 
optimal  stochastic  single-fault  filter.  The  residual  space  of  the  filter  is  divided  into  several 
subspaces  and  each  subspace  is  sensitive  to  only  its  target  fault,  but  not  the  nuisance  faults, 
in  the  sense  that  the  ratio  of  the  transmission  from  the  nuisance  faults  to  the  transmission 
from  target  fault  is  small.  In  the  limit  as  the  ratio  goes  to  zero  and  in  the  absence  of 
sensor  noise  and  a  complementary  subspace,  this  filter  is  equivalent  to  a  Beard-Jones  fault 
detection  filter  which  puts  each  fault  into  an  unobservable  subspace.  This  filter  has  the 
advantages  of  the  unknown  input  observer  in  that  it  can  be  designed  for  robustness  and 
time-varying  systems,  and  the  advantages  of  the  Beard-Jones  fault  detection  filter  by  being 
capable  of  detecting  multiple  faults  in  one  filter.  Although  there  is  additional  computation 
to  determine  the  filter  gain  and  projectors,  this  can  be  done  off-line  so  that  implementation 
is  as  straightforward  as  the  Beard-Jones  fault  detection  filter. 

2.2.5  A  Decentralized  Fault  Detection  Filter 

In  [15,  Appendix  F]  we  introduce  the  decentralized  fault  detection  filter  which  is  the  structure 
that  results  from  merging  decentralized  estimation  theory  with  the  game  theoretic  fault 
detection  filter.  A  decentralized  approach  may  be  the  ideal  way  to  monitor  the  health  of  large- 
scale  systems  since  it  breaks  the  problem  down  into  smaller  pieces  and  it  is  easily  scalable. 
An  essential  feature  is  that  the  local  measurements,  which  may  include  the  information  of 
the  relative  state  space  between  agents,  and  the  fault  direction,  which  may  also  be  associated 
with  the  inter  agent  measurements,  produce  local  state  spaces  from  the  global  state  by  a 
minimal  realization.  This  local  state  space  contains  information  associated  with  multiple 
agents. 
Abstract

We consider an example for a decentralized stochastic optimal
control problem, where the non-classical nature of the
information pattern is induced by the transmission noise in
the system. This example is a reformulation of the Witsenhausen
counterexample, where the first station is allowed to
send its information to the second station through an additive
white Gaussian noise channel. We establish the non-convexity
of the problem in this new formulation and show
that the problem considered here converges asymptotically
to the classical problem or the Witsenhausen problem, as the
transmission noise intensity converges to zero or diverges to
infinity, respectively.


1  Introduction 

Dealing with large scale systems has become a greater challenge
for systems analysts and engineers than ever.
There are many such systems, which are composed of a
large number of complex interconnected subsystems and
hence do not satisfy the centrality assumption that is prevalent
among classical engineering approaches. One of the
main characteristics of these systems is that distributed decisions
must be made based on decentralized information.
Different stations may communicate with each other, possibly
by signaling through noisy channels. The control problem
is to develop coordinated strategies for the stations in
order to achieve a common objective.

The  way  that  information  is  distributed  in  a  decentralized 
system  highly  affects  the  performance  of  the  controlled 
system.  Changes  in  the  information  pattern  will  produce 
changes  in  the  optimal  achievable  cost.  Even  though  there 
are  always  some  constraints  on  how  the  information  can  be 
distributed  in  a  physical  system  (where  to  put  the  sensors 
and the actuators, what to transmit, etc.), in general, there
are many possible information patterns for a given system.

* This research was supported in part by the National Science Foundation
under Grant ECS-9502945, Air Force Office of Scientific Research
under Grant F49620-97-1-0272 and Office of Naval Research under Award
N00014-97-1-0939.

When the stations do not have access to the same information
and/or some stations do not have perfect recall, i.e.,
they lose information, we have a non-classical information
pattern. Optimal strategies for decentralized systems with
general non-classical patterns are still unknown. One main
difficulty is that the information available to one station may
not be sufficient to determine the previous actions by other
stations, which have affected that information. This will destroy
the convexity of the cost function with respect to the
strategies, even though it may look convex in the controls.

In 1968, Witsenhausen provided a simple example in [8],
where there are only two stations, the dynamics are linear,
the underlying uncertainties are additive and Gaussian and
the cost is quadratic. The information pattern, however,
is non-classical. He established the existence of the optimal
design and, by proposing a nonlinear set of strategies,
showed that no affine strategy could be optimal. This seemingly
simple example, which is also called Witsenhausen’s
counterexample, turned out to be extremely hard. It remains
unsolved after 30 years. This example in fact motivated
much research on the links between decentralized stochastic
control problems and team theory and the effects of different
information patterns on decentralized systems. Although
it is a very simple example, it demonstrates the main
difficulties induced by non-classical information patterns.

In the next section, we reformulate Witsenhausen’s problem
by assuming that the first station sends its information to the
second station through a noisy channel. In Section 3, we
obtain an alternative form for the performance index in this
new formulation, which shows the possible non-convexity
of the cost with respect to the strategies. In Section 4,
we consider two limit cases, namely when the transmission
noise intensity is small and when it becomes very large. We
will see how this new formulation covers a wide range of
problems from the classical LQG problem to the Witsenhausen
counterexample. Finally, Section 5 contains some concluding
remarks.


2  Problem  Statement 


Similar to the Witsenhausen problem, consider a two-stage
stochastic problem with the following state equations:

$$x_1 = x_0 + u_1 \tag{2.1}$$

$$x_2 = x_1 - u_2, \tag{2.2}$$


where $x_0$ is the random initial state, which is assumed to be
Gaussian with zero mean and variance $\sigma_0^2$. The information
available to the two stations is determined by the following
output equations:


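$$z_1 = x_0 \tag{2.3}$$

$$z_2 = \begin{bmatrix} x_0 + \epsilon v_t \\ x_0 + u_1 + v_2 \end{bmatrix} \triangleq \begin{bmatrix} z_{21} \\ z_{22} \end{bmatrix}, \tag{2.4}$$
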

where $v_2$ is the measurement noise for the second station,
which is again assumed to be a zero mean Gaussian random
variable with unit variance. As we can see, the information
available to the first station is being transmitted
to the second station through an additive white Gaussian
noise channel with $\epsilon v_t \sim \mathcal{N}(0, \epsilon^2)$ being the transmission
noise. Also $x_0$, $v_2$ and $v_t$ are all assumed to be independent
of each other. Note that the communication uncertainty is
simply modeled as an additive Gaussian noise. This model
may not be very realistic when digital communication is
used. However, since there are already some major difficulties
in dealing with non-classical information patterns,
using more complicated models for the communication uncertainties
may not be very reasonable at this point.


The objective is now to design the controls:

$$u_1 = \gamma_1(z_1) \tag{2.5}$$

$$u_2 = \gamma_2(z_2), \tag{2.6}$$

in order to minimize the following cost function:

$$J = E\left[k^2 u_1^2 + x_2^2\right], \tag{2.7}$$

where $k^2 > 0$ is a given constant. We see that the first
controller has perfect information but its action is costly. In
contrast, the second controller has inexpensive control but
noisy information. Since the second station does not know
exactly what the first station knew, due to the communication
uncertainty, we don’t have perfect recall and hence we
still have a non-classical pattern. If there were no transmission
noise, we would have a classical information pattern
for which a set of strategies, which are linear in the information,
is known to be the unique optimal solution.
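
As a rough illustration of the problem defined by (2.1)-(2.7), the following Python sketch estimates the cost by Monte Carlo simulation for user-supplied strategies. The function name `monte_carlo_cost`, the default parameter values, and the strategies used in the usage lines are assumptions chosen for illustration; they are not taken from the paper and are not claimed to be optimal.

```python
import numpy as np

def monte_carlo_cost(gamma1, gamma2, k=0.5, sigma0=2.0, eps=0.1,
                     n=200_000, seed=0):
    """Estimate J = E[k^2 u1^2 + x2^2] for strategies gamma1(z1), gamma2(z21, z22)."""
    rng = np.random.default_rng(seed)
    x0 = sigma0 * rng.standard_normal(n)   # initial state, variance sigma0^2
    vt = rng.standard_normal(n)            # transmission noise (scaled by eps)
    v2 = rng.standard_normal(n)            # measurement noise, unit variance
    z1 = x0                                # (2.3)
    u1 = gamma1(z1)                        # (2.5)
    x1 = x0 + u1                           # (2.1)
    z21 = x0 + eps * vt                    # first component of (2.4)
    z22 = x1 + v2                          # second component of (2.4)
    u2 = gamma2(z21, z22)                  # (2.6)
    x2 = x1 - u2                           # (2.2)
    return float(np.mean(k**2 * u1**2 + x2**2))   # sample average of (2.7)

if __name__ == "__main__":
    no_action = lambda z1: 0.0 * z1        # first station does nothing
    relay = lambda z21, z22: z21           # second station trusts the channel
    print(monte_carlo_cost(no_action, relay, eps=0.1))
    print(monte_carlo_cost(no_action, relay, eps=0.0))
```

With this particular pair of strategies, $x_2 = -\epsilon v_t$, so the estimated cost is close to $\epsilon^2$ and vanishes when the transmission is noiseless, which is consistent with the classical pattern being recovered in that limit.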


3  An  Alternative  Form  for  the  Performance  Index

In this section, the performance index is rewritten in terms
of the Fisher information matrix, which indicates that the
cost may not be convex in the strategies.

As in the Witsenhausen problem, we define:

$$f(z_1) = z_1 + \gamma_1(z_1) = x_0 + u_1 \tag{3.1}$$

$$g(z_2) = \gamma_2(z_2) = u_2. \tag{3.2}$$

Then the cost can be expressed as:

$$J = E\left[k^2 u_1^2 + x_2^2\right] = E\left[k^2 (z_1 - f(z_1))^2 + (f(z_1) - g(z_2))^2\right] \triangleq J(f, g). \tag{3.3}$$

It is clear that for a fixed strategy $f$, the optimal strategy $g$
is the conditional expectation, i.e.:

$$g^*(z_2) = \arg\min_g J(f, g) = E\left[f(z_1) \mid z_2\right]. \tag{3.4}$$

Substituting back in the cost, we get:

$$\begin{aligned}
J^*(f) &= J(f, g^*) \\
&= k^2 E\left[(z_1 - f(z_1))^2\right] + E\left[(f(z_1) - g^*(z_2))^2\right] \\
&= k^2 E\left[(z_1 - f(z_1))^2\right] + E\left[(f(z_1))^2\right] - E\left[(g^*(z_2))^2\right]
\end{aligned} \tag{3.5}$$

where we have used the orthogonality property of the conditional
expectation:

$$E\left[(f(z_1) - g^*(z_2))\, g^*(z_2)\right] = 0. \tag{3.6}$$

It is important to note the minus sign in the third term in
(3.5). As we shall see, this minus sign could indeed destroy
the convexity of the cost with respect to the strategies.

On the other hand:

$$g^*(z_2) = \int f(z_1)\, p(z_1 \mid z_2)\, dz_1 = \frac{\int f(z_1)\, p(z_1, z_2)\, dz_1}{\int p(z_1, z_2)\, dz_1}, \tag{3.7}$$

where $p(z_1, z_2)$ is the joint probability density of $z_1$ and
$z_2$. The following lemma can be used in order to express
$g^*(z_2)$ in terms of $z_2$ and its probability density.

Lemma 3.1: $f(z_1)\, p(z_1, z_2)$ can be expressed in terms of
$z_{22}$ and the joint probability density of $z_1$ and $z_2$ in the following
form:

$$f(z_1)\, p(z_1, z_2) = z_{22}\, p(z_1, z_2) + \frac{\partial}{\partial z_{22}} p(z_1, z_2). \tag{3.8}$$


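Proof:

$$\begin{aligned}
z_{22}\, p(z_1, z_2) + \frac{\partial}{\partial z_{22}} p(z_1, z_2)
&= z_{22}\, p(z_1, z_2) + \frac{\partial}{\partial z_{22}} \left[ p(z_2 \mid z_1)\, p(z_1) \right] \\
&= z_{22}\, p(z_1, z_2) + \frac{\partial}{\partial z_{22}} \left[ \frac{1}{2\pi\epsilon} \exp\left( -\frac{(z_{21} - z_1)^2}{2\epsilon^2} - \frac{(z_{22} - f(z_1))^2}{2} \right) \right] p(z_1) \\
&= f(z_1)\, p(z_1, z_2),
\end{aligned} \tag{3.9}$$
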


where we have used the specific form of the information
available to the second station and the fact that $\epsilon v_t \sim \mathcal{N}(0, \epsilon^2)$
and $v_2 \sim \mathcal{N}(0, 1)$ are independent.
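
The identity in Lemma 3.1 can be checked numerically. The Python sketch below evaluates the joint density implied by the Gaussian assumptions of Section 2 and compares $f(z_1)\, p(z_1, z_2)$ with $z_{22}\, p(z_1, z_2) + \partial p / \partial z_{22}$, using a central finite difference for the derivative. The function names, the parameter values, and the nonlinear test strategy are hypothetical choices made only for this check.

```python
import numpy as np

def joint_density(z1, z21, z22, f, eps=0.5, sigma0=1.0):
    """Joint density p(z1, z2) = p(z2 | z1) p(z1) under the Gaussian model."""
    p_z1 = np.exp(-z1**2 / (2 * sigma0**2)) / (np.sqrt(2 * np.pi) * sigma0)
    # z21 | z1 ~ N(z1, eps^2) and z22 | z1 ~ N(f(z1), 1), independent given z1
    p_z2_given_z1 = np.exp(-(z21 - z1)**2 / (2 * eps**2)
                           - (z22 - f(z1))**2 / 2) / (2 * np.pi * eps)
    return p_z2_given_z1 * p_z1

def lemma_gap(z1, z21, z22, f, h=1e-5):
    """Left side minus right side of the lemma, with a finite-difference derivative."""
    p = joint_density(z1, z21, z22, f)
    dp = (joint_density(z1, z21, z22 + h, f)
          - joint_density(z1, z21, z22 - h, f)) / (2 * h)
    return f(z1) * p - (z22 * p + dp)

if __name__ == "__main__":
    f = lambda z: np.tanh(2.0 * z)          # an arbitrary nonlinear strategy
    print(lemma_gap(0.3, 0.1, -0.4, f))     # should be near zero
```

The gap is zero up to finite-difference error for any strategy $f$, since the derivative with respect to $z_{22}$ only acts on the unit-variance Gaussian factor centered at $f(z_1)$.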


Substituting for $f(z_1)\, p(z_1, z_2)$ from the above lemma
back in (3.7) and integrating with respect to $z_1$, we will obtain
$g^*(z_2)$ as follows:

$$g^*(z_2) = z_{22} + \frac{\partial}{\partial z_{22}} \ln p(z_2). \tag{3.10}$$


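On the other hand, we have:

$$E\left[z_{22}^2\right] = E\left[(f(z_1))^2\right] + 1, \tag{3.11}$$

and:

$$E\left[z_{22} \frac{\partial}{\partial z_{22}} \ln p(z_2)\right] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} z_{22} \frac{\partial}{\partial z_{22}} \left(\ln p(z_{21}, z_{22})\right) p(z_{21}, z_{22})\, dz_{21}\, dz_{22}. \tag{3.12}$$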

If  we  integrate  by  parts  with  respect  to  Z221  we  will  get: 


The subscript $f$ indicates that it actually depends on the form of the strategy $f$, which is present in the definition of $z_2$ and would affect its probability density function. As we see, the cost is now expressed only in terms of one strategy $f$. This also suggests that, in order to minimize the cost, we need to get the lowest possible cost associated with the first station while we transfer as much information as possible to the second station through the dynamics of the system. The possible non-convexity of the cost with respect to $f$ can also be seen from the above expression. It can be shown that the Fisher information term is a convex functional [3]. Therefore, $1 - I_f(z_2)_{22}$ is concave, and the sum of a convex and a concave functional may not be convex.


4  Limit  Cases 
4.1  Noiseless  Transmission 

We first consider the limit case in which the transmission is noiseless, i.e., $\epsilon = 0$ and hence $z_{21} = z_1$. In this case, the second station knows exactly what the first station knew. Therefore, we have perfect recall and the information pattern is classical. We can write:

$$p(z_2) = p(z_{21}, z_{22}) = p(z_{22} \mid z_{21})\, p(z_{21}) = p(z_{22} \mid z_1)\, p(z_1) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{(z_{22} - f(z_1))^2}{2}\right)\frac{1}{\sqrt{2\pi}\,\sigma_0}\exp\left(-\frac{z_1^2}{2\sigma_0^2}\right). \tag{4.1}$$

Then, from (3.10), we will have:


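$$\int_{-\infty}^{+\infty} z_{22}\, \frac{\partial}{\partial z_{22}}\big(\ln p(z_{21}, z_{22})\big)\, p(z_{21}, z_{22})\, dz_{22} = \Big[z_{22}\, p(z_{21}, z_{22})\Big]_{-\infty}^{+\infty} - \int_{-\infty}^{+\infty} p(z_{21}, z_{22})\, dz_{22} = -p(z_{21}), \tag{3.13}$$

where $z_{22}$ is assumed to have a finite mean value and therefore the first term becomes zero. Hence:

$$E\left[z_{22}\frac{\partial}{\partial z_{22}} \ln p(z_2)\right] = -1. \tag{3.14}$$

We can now obtain $E\left[(g^*(z_2))^2\right]$ and substitute it back in (3.5) to express the performance index in the following form:

$$J^*(f) = k^2 E\left[(z_1 - f(z_1))^2\right] + 1 - I_f(z_2)_{22}, \tag{3.15}$$

where $I_f(z_2)_{22}$ is indeed the (2,2) element of the Fisher information matrix$^1$ for $z_2$, which is defined as follows:

$$I_f(z_2) = E\left[\nabla_{z_2}^{T} \ln p(z_2) \cdot \nabla_{z_2} \ln p(z_2)\right]. \tag{3.16}$$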

$^1$Fisher information originally arises in the Cramér-Rao bound, which is a measure of the minimum error in estimating a parameter based on the value of a random variable. However, by introducing a location parameter, an alternative form of the Fisher information may be defined for a random variable with a given distribution. This alternative form is in fact related to the entropy measure (see [4], p. 494).


$$g^*(z_2) = f(z_1) = f(z_{21}), \tag{4.2}$$

which could easily be obtained from the original definition for $g^*$, i.e.:

$$g^*(z_2) = E[f(z_1) \mid z_2] = f(z_1), \tag{4.3}$$

because $z_1$ is exactly known given $z_2$. Substituting this back in (3.5) and minimizing with respect to the strategy $f$, we will have:

$$g^*(z_2) = f(z_1) = z_1, \tag{4.4}$$

and hence:

$$\gamma_1(z_1) = 0 \tag{4.5}$$

$$\gamma_2(z_2) = z_1, \tag{4.6}$$

which  is  the  unique  linear  set  of  optimal  strategies.  This 
indeed  turns  out  to  be  a  very  simple  example  of  the  well- 
known  LQG  problems. 

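4.2 Infinite Transmission Noise Intensity
Another limit case is when the transmission noise intensity increases to infinity. In this case, $z_{21}$ and $z_{22}$ indeed become independent and we will have:

$$p(z_2) = p(z_{21}, z_{22}) = p(z_{21})\, p(z_{22}). \tag{4.7}$$

The Fisher information term can now be written as:

$$\begin{aligned}
I_f(z_2)_{22} &= \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} \left(\frac{\partial}{\partial z_{22}} \ln p(z_{21}, z_{22})\right)^2 p(z_{21}, z_{22})\, dz_{21}\, dz_{22} \\
&= \int_{-\infty}^{+\infty} \left(\frac{\partial}{\partial z_{22}} \ln p(z_{22})\right)^2 p(z_{22})\, dz_{22} = I_f(z_{22}),
\end{aligned} \tag{4.8}$$

which is actually the Fisher information content of $z_{22}$ only. Hence:

$$J^*(f) = k^2 E\left[(z_1 - f(z_1))^2\right] + 1 - I_f(z_{22}). \tag{4.9}$$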
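As a small illustration of (4.9), the sketch below evaluates the cost for a linear strategy $f(z_1) = a z_1$ in the infinite-noise limit, where $z_{22} = a x_0 + v_2$ is Gaussian with variance $a^2\sigma_0^2 + 1$. The restriction to linear $f$, the grid-based quadrature, and the function names are assumptions made for this example only.

```python
import numpy as np

def fisher_information_gaussian(var, half_width=12.0, n=200_001):
    """Location Fisher information of N(0, var), by a Riemann sum of score^2 * density."""
    s = np.sqrt(var)
    z = np.linspace(-half_width * s, half_width * s, n)
    dz = z[1] - z[0]
    density = np.exp(-z ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)
    score = -z / var  # d/dz ln p(z) for a zero-mean Gaussian
    return float(np.sum(score ** 2 * density) * dz)

def cost_linear_infinite_noise(a, k, sigma0):
    """Cost (4.9) for the assumed linear strategy f(z1) = a*z1."""
    var_z22 = a ** 2 * sigma0 ** 2 + 1.0  # variance of z22 = a*x0 + v2
    stage_one = k ** 2 * (1.0 - a) ** 2 * sigma0 ** 2
    return stage_one + 1.0 - fisher_information_gaussian(var_z22)
```

For a Gaussian the Fisher information is the reciprocal of the variance, so the second-stage term $1 - I_f(z_{22})$ reduces to $a^2\sigma_0^2/(a^2\sigma_0^2+1)$, which is the usual mean square error of estimating $a x_0$ from $z_{22}$. The numerical quadrature is kept only so that the same routine could be adapted to a non-Gaussian density.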

This is the same result that was presented for the Witsenhausen counterexample in [8]. Intuitively, when we have infinite transmission noise, we might as well deny the second station access to $z_1$, which is exactly the case in Witsenhausen's counterexample. The optimal strategies for this case are still unknown. Witsenhausen showed that the optimal solution exists, even if $x_0$ has a general distribution with a finite second moment [8]. He then showed that if one of the strategies is restricted to be affine, the other optimal strategy would also be affine. But then he provided a set of nonlinear strategies that could achieve a lower cost for some values of $k$ and $\sigma_0$. Different approaches have been taken in order to find the optimal strategies. The asymptotic approach was used in [2] for the case where $\sigma_0$ is small. In [1], a neural network, trained by stochastic approximation techniques, was used in order to approximate the optimal strategies. It was demonstrated that the optimal $f^*(z_1)$ may not be strictly piecewise constant, as was suggested by Witsenhausen, but slightly sloped. Some researchers have tried to attack the problem numerically, using sample and search techniques to find the solution. A discretized version of the problem was formulated in [5], which was later shown in [7] to be NP-complete and computationally intractable. It was recently asserted in [6] that a global optimum would be achieved by searching directly in the strategy space using generalized step functions to approximate $f(z_1)$.


5 Concluding Remarks

A simple example of a decentralized control problem with noisy information transmission was investigated. This example is a reformulation of the Witsenhausen counterexample by allowing the first station to send its information to the second station through an additive white Gaussian noise channel. In fact, we show that Witsenhausen's original counterexample can be seen as a limit case in this new formulation. We believe that this new formulation is closer to many applications in large scale systems, where different pieces of information could be transmitted among the stations through some noisy channels. We should note that as soon as some kind of communication uncertainty is introduced for the transmission, the information pattern is no longer classical and the cost may no longer be convex in the strategies. Hence, the optimal strategies, which may not even be unique, are very difficult to find. Approaches similar to those that have so far been used for the Witsenhausen problem might be applied to this new formulation as well. For example, asymptotic approaches using expansions in small $\epsilon$ are possible.

Abstract

A reformulation of the Witsenhausen counter-example is considered, where the first station is allowed to transmit its information to the second station through a low noise channel. This is in fact a decentralized stochastic system where the communication uncertainty induces a non-classical information pattern. Assuming a small transmission noise intensity, an asymptotic approach is used in order to find an approximated cost. Then, the necessary conditions for asymptotically optimal strategies are obtained using a variational approach. It is shown that the necessary conditions are satisfied by linear strategies with slightly different coefficients than the noiseless transmission case.

1  Introduction 

Coordinating and controlling dynamic systems in spatial networks has always been a challenging problem for system designers. It is now attracting more attention as various new applications are emerging in a very wide range, from controlling autonomous vehicles in formation to flow and congestion control in computer networks. However, there are still some major difficulties in dealing with such systems.

The main characteristic of any decentralized system is that the information is distributed among different stations and the performance of the system highly depends on the corresponding information pattern, i.e., who knows what and when. The stations may communicate with each other, possibly by signaling through noisy channels. Even though there might be some physical constraints on the information structure of the system (e.g. locations of the sensors, the actuators, and the transmitters), in general, an optimal information pattern should be obtained. Then, based on the locally available information, a set of coordinated local

This research was supported in part by the National Science Foundation under Grant ECS-9TO945, the Air Force Office of Scientific Research under Grant F49620-97-1-0272, and the Office of Naval Research under Award N00014-97-1-0939.

0-7803-5250-5/99/$10.00 © 1999 IEEE


strategies  should  be  designed  in  order  to  achieve  a  common 
objective.  In  many  cases,  however,  we  will  end  up  with  non- 
convex  functional  optimization  problems,  which  are  usually 
very  difficult  to  solve. 

One such class of problems is when the decentralized system has a non-classical information pattern which is not partially nested. In this case, some stations cannot reconstruct the previous actions of other stations which have affected their own local information. Unfortunately, this happens in many decentralized systems.

In 1968, Witsenhausen provided a simple example in [6], where there are only two stations, the dynamics are linear, the underlying uncertainties are additive and Gaussian and the cost is quadratic. The information pattern, however, is non-classical. This example, which demonstrates some of the major difficulties in dealing with non-classical information patterns, motivated much research on the links between decentralized stochastic control problems and team theory and the effects of different information patterns on decentralized systems.

In this example, one station acts first and affects the information available to the next station while there is no way for the second station to determine the action of the first station. The existence of the optimal design was established in [6], where a nonlinear set of strategies was also proposed which showed that no affine strategy could be optimal. It was later shown in [3] that when the uncertainty on the information available to the first station is small, linear strategies would still be optimal over a large class of nonlinear strategies. Intuitively, when the uncertainty on the information of the first station is small, the second station will also be able to guess what that information was. Therefore, since the problem is cooperative in the sense that the stations are aware of each other's strategies, the second station can almost reconstruct the action of the first station and there is no need for any kind of signaling among the stations through the dynamics of the system.

$$g(z_2) = \gamma_2(z_2) = u_2. \tag{2.9}$$

Then the cost can be expressed as:

$$J = E\left[k^2 u_1^2 + x_2^2\right] = E\left[k^2 (z_1 - f(z_1))^2 + (f(z_1) - g(z_2))^2\right] \triangleq J(f, g). \tag{2.10}$$

If we fix the function $f$, the optimal strategy $g$ will clearly be obtained as the conditional expectation, i.e.:

$$g^*(z_2) = \arg\min_{g} J(f, g) = E[f(z_1) \mid z_2]. \tag{2.11}$$

It was shown in [5] that:

$$g^*(z_2) = z_{22} + \frac{\partial}{\partial z_{22}} \ln p(z_2), \tag{2.12}$$

where $p(z_2) = p(z_{21}, z_{22})$ is the probability density function for the information available to the second station. It was further shown that the cost can be written as follows and may not be convex in $f$:

$$\begin{aligned}
J^*(f) &= k^2 E\left[(z_1 - f(z_1))^2\right] + E\left[(f(z_1) - g^*(z_2))^2\right] && (2.13) \\
&= k^2 E\left[(z_1 - f(z_1))^2\right] + 1 - I_f(z_2)_{22}, && (2.14)
\end{aligned}$$

where $I_f(z_2)_{22}$ is the (2,2) element of the Fisher information matrix for $z_2$, which is defined as:


to find the corresponding expansion for $g^*(z_2)$. By substituting back in (2.13), we will obtain the expanded cost only in terms of $f$.


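The probability density function for $z_2$ can be written as follows:

$$\begin{aligned}
p_\epsilon(z_2) = p(z_2) &= \int_{-\infty}^{+\infty} p(z_{22}, z_{21}, z_1)\, dz_1 && (3.1) \\
&= \int_{-\infty}^{+\infty} p(z_{22} \mid z_{21}, z_1)\, p(z_{21} \mid z_1)\, p(z_1)\, dz_1 && (3.2) \\
&= \int_{-\infty}^{+\infty} p(z_{22} \mid z_1)\, p(z_{21} \mid z_1)\, p(z_1)\, dz_1 && (3.3) \\
&= \int_{-\infty}^{+\infty} p(z_{22} \mid z_1)\, p_{v_t}(z_{21} - z_1)\, p(z_1)\, dz_1 && (3.4) \\
&= \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{(z_{22} - f(z_1))^2}{2}\right) \frac{1}{\sqrt{2\pi}\,\epsilon}\exp\left(-\frac{(z_{21} - z_1)^2}{2\epsilon^2}\right) \frac{1}{\sqrt{2\pi}\,\sigma_0}\exp\left(-\frac{z_1^2}{2\sigma_0^2}\right) dz_1, && (3.5)
\end{aligned}$$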
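The integral in (3.5) is easy to evaluate numerically for a given strategy. The following sketch substitutes $z_1 = z_{21} + \epsilon y$ and uses Gauss-Hermite quadrature over $y$; the function name `p_eps`, the number of nodes, and the use of quadrature at all are choices made for this illustration and are not taken from the paper.

```python
import numpy as np

def p_eps(z21, z22, f, sigma0, eps, n_nodes=80):
    """Approximate the density (3.5) of z2 = (z21, z22) for a strategy f.

    With z1 = z21 + eps*y the transmission-noise factor becomes a standard
    normal density in y, which matches the Gauss-Hermite weight exp(-y^2/2).
    """
    y, w = np.polynomial.hermite_e.hermegauss(n_nodes)
    z1 = z21 + eps * y
    second_station = np.exp(-(z22 - f(z1)) ** 2 / 2.0) / np.sqrt(2.0 * np.pi)
    prior = np.exp(-z1 ** 2 / (2.0 * sigma0 ** 2)) / (np.sqrt(2.0 * np.pi) * sigma0)
    # Divide by sqrt(2*pi) to turn the weight exp(-y^2/2) into a normal density.
    return float(np.sum(w * second_station * prior) / np.sqrt(2.0 * np.pi))
```

For a linear strategy such as `f = lambda z1: 0.7 * z1`, the result can be compared against the bivariate Gaussian density of $(x_0 + v_t,\; 0.7 x_0 + v_2)$, which gives a quick sanity check before trying nonlinear strategies.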


where for (3.3) we have used the facts that the $\sigma$-fields generated by $\{z_{21}, z_1\}$ and $\{z_1, v_t\}$ are the same and that $z_1$, $v_t$ and $v_2$ are mutually independent. For small $\epsilon$, we now approximate $\ln p_\epsilon(z_2)$ by considering only the first three terms of its expansion around $\epsilon = 0$. Namely:


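$$\ln p_\epsilon(z_2) \simeq \ln p_0(z_2) + \left.\frac{\partial}{\partial \epsilon} \ln p_\epsilon(z_2)\right|_{\epsilon=0} \epsilon + \frac{1}{2}\left.\frac{\partial^2}{\partial \epsilon^2} \ln p_\epsilon(z_2)\right|_{\epsilon=0} \epsilon^2. \tag{3.6}$$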


By making the following change of variables:

$$\epsilon y = z_1 - z_{21} \;\Rightarrow\; \epsilon\, dy = dz_1, \tag{3.7}$$


$$I_f(z_2) = E\left[\nabla_{z_2}^{T} \ln p(z_2) \cdot \nabla_{z_2} \ln p(z_2)\right]. \tag{2.15}$$

As we mentioned earlier, for the noiseless transmission case, the unique optimal strategies, which are linear in the information, are easily obtained. On the other hand, when the transmission noise intensity $\epsilon$ is small, we would still expect a similar behavior for the optimal strategies. In the following sections, we will consider this case. Namely, we will assume $v_t$ has a small intensity. Under this assumption, we will obtain the first few terms in the expansion of the cost in terms of $\epsilon$. We will then use the Hamiltonian approach in order to find the necessary conditions for the strategies that minimize the approximated cost.

3  An  Expansion  for  the  Cost 


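we can write $p_\epsilon(z_2)$ in the following form:

$$p_\epsilon(z_2) = \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{(z_{22} - f_\epsilon(y))^2}{2}\right) \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}\, \frac{1}{\sqrt{2\pi}\,\sigma_0}\exp\left(-\frac{(\epsilon y + z_{21})^2}{2\sigma_0^2}\right) dy, \tag{3.8}$$

where:

$$f_\epsilon(y) = f(\epsilon y + z_{21}). \tag{3.9}$$

It is now clear that:

$$p_0(z_2) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{(z_{22} - f(z_{21}))^2}{2}\right) \frac{1}{\sqrt{2\pi}\,\sigma_0}\exp\left(-\frac{z_{21}^2}{2\sigma_0^2}\right), \tag{3.10}$$

and hence: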


Assume that the first station communicates with the second station through a low noise channel. In other words, the transmission noise intensity $\epsilon$ is assumed to be small. In this section, we will find an expansion for the cost in terms of $\epsilon$. For this purpose, we first find an expansion for the probability density function of the information available to the second station, i.e., $p(z_2)$. Then, we use (2.12) in order

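$$\ln p_0(z_2) = -\frac{(z_{22} - f(z_{21}))^2}{2} - \frac{z_{21}^2}{2\sigma_0^2} - \ln(2\pi\sigma_0). \tag{3.11}$$

For the first order term, we have:

$$\left.\frac{\partial}{\partial \epsilon} \ln p_\epsilon(z_2)\right|_{\epsilon=0} = \frac{1}{p_0(z_2)} \left.\frac{\partial}{\partial \epsilon}\, p_\epsilon(z_2)\right|_{\epsilon=0}. \tag{3.12}$$

output equations: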


This seemingly simple example, which has come to be called Witsenhausen's counter-example, turned out to be extremely hard. It is still outstanding after 30 years. However, new emerging applications and the necessity of looking back at some fundamental obstacles in designing decentralized stochastic strategies have recently inspired some new research on this example. In [1], a neural network, trained by stochastic approximation techniques, was used in order to approximate the optimal strategies. Also, it was recently asserted in [4] that a global optimum would be achieved by searching directly in the strategy space using generalized step functions to approximate the strategies.


$$z_1 = x_0 \tag{2.3}$$

$$z_2 = \begin{bmatrix} x_0 + v_t \\ x_0 + u_1 + v_2 \end{bmatrix} \triangleq \begin{bmatrix} z_{21} \\ z_{22} \end{bmatrix}, \tag{2.4}$$

where $v_2$ is the measurement noise for the second station, which is also assumed to be a zero mean Gaussian random variable with unit variance. As we can see, the information available to the first station is being transmitted to the second station and the communication uncertainty is modeled by an additive Gaussian noise $v_t \sim \mathcal{N}(0, \epsilon^2)$. Also, $x_0$, $v_2$ and $v_t$ are all assumed to be independent of each other.


In Witsenhausen's problem, the non-classical nature of the information pattern is a result of the fact that the information available to the first station is completely inaccessible for the second station. However, recent advances in computing and communication technologies make it possible for the stations in many decentralized systems to communicate different pieces of information. But the communications can never be perfect and there is always some uncertainty involved. Unfortunately, such uncertainty will again induce a non-classical nature on the information pattern of the system.

In [5], Witsenhausen's problem was reformulated in such a way that the first station could communicate its information to the second station through a noisy channel. It was shown that as long as there is noise in transmission, the main difficulties will persist. Specifically, the cost might still be non-convex with respect to the strategies. However, when the transmission noise intensities are small, we would expect the optimal strategies to be very close to the corresponding strategies for the noiseless transmission case.

In the next section, we formulate the problem and discuss some of the results obtained in [5]. In Section 3, we approximate the cost by expanding it in terms of the small transmission noise intensity. In Section 4, we use a variational approach in order to find a necessary condition for the strategies which minimize the approximated cost. As we shall see, we will actually have a singular optimization problem. We will then show that asymptotically optimal strategies may still be linear with slightly different coefficients than the corresponding strategies for the noiseless transmission case. Finally, in the last section, we will have our concluding remarks.

2  Problem  Description 

Consider a two-stage stochastic problem with the following state equations:

$$x_1 = x_0 + u_1 \tag{2.1}$$

$$x_2 = x_1 - u_2, \tag{2.2}$$


It is clear that we have simply modeled the received information signal as the transmitted signal plus a Gaussian transmission noise. While this model is realistic for analog communication systems, it may not be well justified when digital communication is used. In digital communication systems, the signal is quantized, coded and sent through the channel. Still, the channel noise may realistically be assumed to be additive and Gaussian, but sophisticated modulation and coding schemes make it difficult to assume a simple additive Gaussian uncertainty for the received information signal. However, if we try to incorporate the quantization effects along with the error probability distribution for some good coding and modulation schemes in order to model the communication uncertainties, we will end up with models which could still be approximated, to some degree, by simple additive Gaussian models. On the other hand, since there are already major difficulties in dealing with decentralized non-classical information patterns, using more complex models for communication uncertainties may not seem very reasonable at this point. Furthermore, we believe that the results obtained under such a simplifying assumption would still serve as a guideline for finding the true nature of decentralized strategies.

The objective is now to design:

$$u_1 = \gamma_1(z_1) \tag{2.5}$$

$$u_2 = \gamma_2(z_2), \tag{2.6}$$

in order to minimize the following cost function:

$$J = E\left[k^2 u_1^2 + x_2^2\right], \tag{2.7}$$

where $k^2 > 0$ is a given constant. We see that the first controller has perfect information but its action is costly. In contrast, the second controller has inexpensive control but noisy information. Since the second station does not know what the first station knew, due to the transmission noise, we don't have perfect recall and hence we still have a non-classical pattern. If there was no transmission noise, we would have a classical information pattern for which the unique optimal strategies are known to be linear in the information.
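The problem set up in (2.1)-(2.7) can be explored by simulation before any analysis. The sketch below is a minimal Monte Carlo estimate of the cost for a given pair of strategies; it assumes $z_1 = x_0$, $z_{21} = x_0 + v_t$ and $z_{22} = x_1 + v_2$ as in the output equations, and the function name, sample size and seed are hypothetical choices for this example.

```python
import numpy as np

def simulate_cost(gamma1, gamma2, k, sigma0, eps, n=200_000, seed=0):
    """Monte Carlo estimate of J = E[k^2 u1^2 + x2^2] for strategies gamma1, gamma2."""
    rng = np.random.default_rng(seed)
    x0 = rng.normal(0.0, sigma0, n)  # initial state
    vt = rng.normal(0.0, eps, n)     # transmission noise
    v2 = rng.normal(0.0, 1.0, n)     # measurement noise at the second station
    z1 = x0                          # the first station observes x0 exactly
    u1 = gamma1(z1)
    x1 = x0 + u1
    z21 = x0 + vt                    # noisy copy of z1 received by station 2
    z22 = x1 + v2                    # station 2's own measurement
    u2 = gamma2(z21, z22)
    x2 = x1 - u2
    return float(np.mean(k ** 2 * u1 ** 2 + x2 ** 2))
```

For example, with `gamma1 = lambda z1: np.zeros_like(z1)` and `gamma2 = lambda z21, z22: z21`, the second station simply trusts the transmitted value, and the estimated cost is the sample mean of $v_t^2$: zero for `eps=0.0` and close to `eps**2` otherwise.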


where $x_0$ is the initial state, which is assumed to be a zero mean Gaussian random variable with variance $\sigma_0^2$. The information pattern of the system is specified by the following

For simplicity, let's define:

$$f(z_1) = z_1 + \gamma_1(z_1) = x_0 + u_1 \tag{2.8}$$

On the other hand:


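$$\begin{aligned}
\left.\frac{\partial}{\partial \epsilon}\, p_\epsilon(z_2)\right|_{\epsilon=0}
&= \int_{-\infty}^{+\infty} \frac{\partial}{\partial \epsilon}\left[\frac{1}{\sqrt{2\pi}}\, e^{-(z_{22} - f_\epsilon(y))^2/2}\, \frac{1}{\sqrt{2\pi}\,\sigma_0}\, e^{-(\epsilon y + z_{21})^2/(2\sigma_0^2)}\right]_{\epsilon=0} \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}\, dy \\
&= p_0(z_2)\left[(z_{22} - f(z_{21}))\, f'(z_{21}) - \frac{z_{21}}{\sigma_0^2}\right] \int_{-\infty}^{+\infty} y\, \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}\, dy = 0.
\end{aligned} \tag{3.13}$$

Therefore:

$$\left.\frac{\partial}{\partial \epsilon} \ln p_\epsilon(z_2)\right|_{\epsilon=0} = 0. \tag{3.14}$$

This result is not unexpected, because we would expect the behavior of $p_\epsilon(z_2)$ to depend only on the variance of the Gaussian transmission noise, i.e., $\epsilon^2$. Using (3.14), we can now obtain the second order term as: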


where we have neglected the fourth order term in $\epsilon$. Substituting this expansion back in (2.13), we will obtain the following expansion for the cost:

$$\begin{aligned}
J^*(f) ={}& k^2 E\left[(z_1 - f(z_1))^2\right] + E\left[(f(z_1))^2\right] - E\left[(f(z_{21}))^2\right] \\
&- 2\epsilon^2 E\left[f(z_{21})\left(f''(z_{21}) + 2f'^2(z_{21})\,(z_{22} - f(z_{21})) + 2f'(z_{21})\left(-\frac{z_{21}}{\sigma_0^2}\right)\right)\right].
\end{aligned} \tag{3.19}$$

Note that when the transmission is noiseless, i.e., $\epsilon = 0$ and therefore $z_{21} = z_1$, we have:

$$J^*(f) = k^2 E\left[(z_1 - f(z_1))^2\right], \tag{3.20}$$

and $f(z_1) = z_1$ is the obvious unique optimal solution.

The above expansion, however, is not exactly in our desired form yet. This is because the third term on the right hand side, which is an average over $z_{21}$, still depends on $\epsilon$. We shall now rewrite the expansion in (3.19) by explicitly expressing the expectations based on the corresponding probability densities:

(3.15)

After some tedious but straightforward manipulations, we will get:

$$\left.\frac{\partial^2}{\partial \epsilon^2} \ln p_\epsilon(z_2)\right|_{\epsilon=0} = -f'^2(z_{21}) + f''(z_{21})\,(z_{22} - f(z_{21})) + f'^2(z_{21})\,(z_{22} - f(z_{21}))^2 + 2f'(z_{21})\,(z_{22} - f(z_{21}))\left(-\frac{z_{21}}{\sigma_0^2}\right). \tag{3.16}$$

We can now obtain a second order approximation for $\ln p_\epsilon(z_2)$ by substituting the corresponding terms from (3.11), (3.14) and (3.16) back into the expansion (3.6). In the next step, we substitute the expansion for $\ln p_\epsilon(z_2)$ in (2.12) in order to find the corresponding expansion for $g^*(z_2)$. Remember that $g^*(z_2)$ is the optimal strategy for the second station assuming that the first station has a fixed strategy $\gamma_1(z_1) = f(z_1) - z_1$. We have:


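$$\begin{aligned}
J^*(f) \simeq{}& \int_{-\infty}^{+\infty} \left[k^2 (t - f(t))^2 + f^2(t)\right] \frac{1}{\sqrt{2\pi}\,\sigma_0}\, e^{-t^2/(2\sigma_0^2)}\, dt \\
&- \int_{-\infty}^{+\infty} \left[f^2(t) + 2\epsilon^2\left(f(t) f''(t) - 2 f(t) f'(t) \frac{t}{\sigma_0^2}\right)\right] \frac{1}{\sqrt{2\pi(\sigma_0^2 + \epsilon^2)}}\, e^{-t^2/(2(\sigma_0^2 + \epsilon^2))}\, dt \\
&- \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} 4\epsilon^2 f(t) f'^2(t)\,(z_{22} - f(t))\, \frac{1}{\sqrt{2\pi}\,\sigma_0}\, e^{-t^2/(2\sigma_0^2)}\, \frac{1}{\sqrt{2\pi}}\, e^{-(z_{22} - f(t))^2/2}\, dz_{22}\, dt,
\end{aligned} \tag{3.21}$$

where we have substituted $p(z_2) = p(z_{22}, z_{21}) \simeq p_0(z_2)$ in the third term, since the higher order terms would be multiplied by $\epsilon^2$ and then would be neglected. Now, the third term turns out to be zero, because:

$$\int_{-\infty}^{+\infty} 4\epsilon^2 f(t) f'^2(t)\, \frac{1}{\sqrt{2\pi}\,\sigma_0}\, e^{-t^2/(2\sigma_0^2)} \left(\int_{-\infty}^{+\infty} (z_{22} - f(t))\, \frac{1}{\sqrt{2\pi}}\, e^{-(z_{22} - f(t))^2/2}\, dz_{22}\right) dt = 0. \tag{3.22}$$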


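$$\begin{aligned}
g^*(z_2) &= z_{22} + \frac{\partial}{\partial z_{22}} \ln p(z_2) \\
&\simeq z_{22} - (z_{22} - f(z_{21})) + \epsilon^2\left[f''(z_{21}) + 2f'^2(z_{21})\,(z_{22} - f(z_{21})) + 2f'(z_{21})\left(-\frac{z_{21}}{\sigma_0^2}\right)\right].
\end{aligned} \tag{3.17}$$

Our goal is to get an expansion for the cost, which is in the form (2.13). Using the expansion for $g^*(z_2)$ from (3.17), we will have:

$$E\left[(g^*(z_2))^2\right] \simeq E\left[(f(z_{21}))^2\right] + 2\epsilon^2 E\left[f(z_{21})\left(f''(z_{21}) + 2f'^2(z_{21})\,(z_{22} - f(z_{21})) + 2f'(z_{21})\left(-\frac{z_{21}}{\sigma_0^2}\right)\right)\right], \tag{3.18}$$

On the other hand, we can expand the probability density of $z_{21}$ up to the second order in $\epsilon$: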
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Substituting  (3.22)  and  the  above  expansion  back  in  (3.21) 
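and neglecting the higher order terms in $\epsilon$, we can finally get the following expansion for the cost:

$$\begin{aligned}
J^*(f) \simeq{}& \int_{-\infty}^{+\infty} k^2 (t - f(t))^2\, \frac{1}{\sqrt{2\pi}\,\sigma_0}\, e^{-t^2/(2\sigma_0^2)}\, dt \\
&+ \epsilon^2 \int_{-\infty}^{+\infty} \left[4 f(t) f'(t) \frac{t}{\sigma_0^2} - 2 f(t) f''(t) + f^2(t)\, \frac{\sigma_0^2 - t^2}{\sigma_0^4}\right] \frac{1}{\sqrt{2\pi}\,\sigma_0}\, e^{-t^2/(2\sigma_0^2)}\, dt \\
\triangleq{}& J_0^* + \epsilon^2 J_1^*.
\end{aligned} \tag{3.24}$$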

The  objective  is  now  to  obtain  the  function  /  which  mini¬ 
mizes  the  above  approximated  cost.  In  the  next  section,  we 
will  use  a  variational  approach  in  order  to  find  the  necessary 
conditions  for  such  a  function. 


4 Minimizing the Approximated Cost

So far, we have obtained an expansion for the cost assuming that the transmission noise intensity is small. We have approximated the cost by including only up to the second order term in $\epsilon$. We should now try to minimize this approximated cost and find the optimal $f^*$. Obviously, the corresponding optimal strategy would be valid only for small transmission noise intensity. However, it would still be very helpful for the analysis of the behavior of the optimal strategies when we deviate a little bit from the classical information pattern by introducing a small communication uncertainty.

We now use the Hamiltonian approach in order to find the necessary conditions for the function $f(t)$ which minimizes our approximated cost. For simplicity, let's denote:

$$x_1(t) = f(t), \qquad x_2(t) = \dot{x}_1(t) = f'(t), \qquad u(t) = \dot{x}_2(t) = \ddot{x}_1(t) = f''(t).$$

The Hamiltonian is then defined as [2]:

$$H = k^2 (t - x_1(t))^2\, p(t) + \epsilon^2\left(4 x_1(t) x_2(t) \frac{t}{\sigma_0^2} - 2 x_1(t) u(t) + x_1^2(t)\, \frac{\sigma_0^2 - t^2}{\sigma_0^4}\right) p(t) + \lambda_1(t) x_2(t) + \lambda_2(t) u(t), \tag{4.1}$$

where $p(t)$ is the Gaussian density with zero mean and variance $\sigma_0^2$, and $\lambda_1$ and $\lambda_2$ are the Lagrange multipliers which should satisfy:

$$\dot{\lambda}_1(t) = -H_{x_1} = \left(2k^2 (t - x_1(t)) - 4\epsilon^2 x_2(t) \frac{t}{\sigma_0^2} - 2\epsilon^2 x_1(t)\, \frac{\sigma_0^2 - t^2}{\sigma_0^4} + 2\epsilon^2 u(t)\right) p(t) \tag{4.2}$$

$$\dot{\lambda}_2(t) = -H_{x_2} = -4\epsilon^2 x_1(t) \frac{t}{\sigma_0^2}\, p(t) - \lambda_1(t). \tag{4.3}$$

But as we can see, the Hamiltonian is linear in $u(t)$ and we actually have a singular optimization problem. The singular surface will be characterized by setting $H_u$ and its derivatives with respect to $t$ equal to zero, that is:

$$H_u = -2\epsilon^2 x_1(t)\, p(t) + \lambda_2(t) = 0, \tag{4.4}$$

and:

$$\frac{d}{dt} H_u = -2\epsilon^2 x_2(t)\, p(t) - 2\epsilon^2 x_1(t)\, \dot{p}(t) + \dot{\lambda}_2(t) = 0. \tag{4.5}$$

Substituting $\dot{p}(t) = -\frac{t}{\sigma_0^2}\, p(t)$ and $\dot{\lambda}_2(t)$ from (4.3), we will get:

$$\frac{d}{dt} H_u = -2\epsilon^2 x_2(t)\, p(t) - 2\epsilon^2 x_1(t) \frac{t}{\sigma_0^2}\, p(t) - \lambda_1(t) = 0. \tag{4.6}$$

Differentiating again and substituting $\dot{\lambda}_1$ from (4.2), we will have:

$$\frac{d^2}{dt^2} H_u = \left(-4\epsilon^2 u(t) + 4\epsilon^2 \frac{t}{\sigma_0^2}\, x_2(t) - 2k^2 (t - x_1(t))\right) p(t) = 0. \tag{4.7}$$

Therefore, the corresponding $u(t)$ on the singular surface is:

$$u(t) = x_2(t) \frac{t}{\sigma_0^2} - \frac{k^2}{2\epsilon^2}\,(t - x_1(t)). \tag{4.8}$$

Note that the first order generalized Legendre-Clebsch condition, which is a necessary condition for $u(t)$ to be minimizing on the singular surface, is also satisfied, namely:

$$\frac{\partial}{\partial u}\left[\frac{d^2}{dt^2} H_u\right] = -4\epsilon^2 p(t) < 0. \tag{4.9}$$

Therefore, the corresponding $x_1(t)$ and $x_2(t)$, which minimize our approximated cost, should necessarily satisfy the following differential equations:

$$\dot{x}_1(t) = x_2(t) \tag{4.10}$$

$$\dot{x}_2(t) = x_2(t) \frac{t}{\sigma_0^2} - \frac{k^2}{2\epsilon^2}\,(t - x_1(t)). \tag{4.11}$$

Since $\epsilon$ is assumed to be small, we may assume the following form in order to obtain the solutions for the above differential equations:

$$x_1(t) = a_0(t) + \epsilon^2 a_2(t) + \epsilon^4 a_4(t) + \cdots \tag{4.12}$$

$$x_2(t) = b_0(t) + \epsilon^2 b_2(t) + \epsilon^4 b_4(t) + \cdots \tag{4.13}$$

Interestingly enough, by substituting the above $x_1$ and $x_2$ back into the differential equations and comparing the coefficients of the terms with the same order in $\epsilon$, we will get:


Back to our original notation, we indeed have:


As we can see, the solution is still linear with a coefficient which is slightly different from the corresponding coefficient for the noiseless transmission case. Remember that $f(z_1) = z_1$ is the optimal solution when there is no transmission noise and note that for $\epsilon = 0$ in (4.15), we get exactly the same solution, as we would expect. Also note that as the value of $k^2\sigma_0^2$ increases, the above solution approaches $f(z_1) = z_1$. In other words, increasing $k^2\sigma_0^2$ has an effect similar to decreasing the communication uncertainty. Given the above function $f(z_1)$, the corresponding $g^*(z_2)$ can easily be obtained using (2.11). Note that it will still be linear because of the Gaussian assumption for the underlying uncertainties.
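The qualitative claims above can be checked against the singular-arc equations. The sketch below assumes that (4.10)-(4.11) read $\dot{x}_1 = x_2$ and $\dot{x}_2 = x_2\, t/\sigma_0^2 - \frac{k^2}{2\epsilon^2}(t - x_1)$, which is a reading of a damaged source, and tests a linear candidate $x_1(t) = a t$. The coefficient returned by `linear_coefficient` is derived from that reading for this illustration and is not quoted from (4.15); both function names are hypothetical.

```python
def singular_arc_residual(a, t, k, sigma0, eps):
    """Residual of the assumed (4.11) for the linear candidate x1(t) = a*t."""
    x1, x2, x2_dot = a * t, a, 0.0  # x2 = dx1/dt is constant for a linear x1
    return x2_dot - (x2 * t / sigma0 ** 2 - k ** 2 / (2.0 * eps ** 2) * (t - x1))

def linear_coefficient(k, sigma0, eps):
    """Slope a that makes the residual vanish for every t (derived, not quoted)."""
    K = k ** 2 * sigma0 ** 2
    return K / (K + 2.0 * eps ** 2)
```

Under this reading, the slope equals 1 when `eps` is zero, is slightly below 1 for small `eps`, and approaches 1 as $k^2\sigma_0^2$ grows, which matches the behavior described in the text.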

In fact, we would expect the optimal strategies to be linear. As we mentioned in Section 1, linear strategies were shown to be asymptotically optimal for the Witsenhausen example when the uncertainty of the information available to the first station is small [3]. In this paper, we have considered a reformulation of Witsenhausen's problem where the first station sends its information to the second station through a low noise channel. These two scenarios are somewhat similar. Namely, in both scenarios the second station can determine the information available to the first station fairly accurately. Specifically, in the first scenario, the second station almost knows $z_1$ because of its small uncertainty, while in the second scenario, it can determine $z_1$ from the information that is transmitted through a low noise channel.

We would also expect the optimal strategies to approach the corresponding strategies for the noiseless transmission case as the value of $z_1$ and, in some sense, the signal to noise ratio increases. This doesn't seem to happen in the solution (4.15). One may justify this by looking at the exponential function in the cost. This function drives the integrand of the cost to zero exponentially fast for large $z_1$. Therefore, the structure of the cost does not force the optimal solution to approach $f(z_1) = z_1$ as $z_1$ increases.

Substituting $f(z_1)$ from (4.15) back into the cost (3.24), we obtain the corresponding value of the cost:

=  777^7 

(4.16) 

The optimal cost for the noiseless transmission case is zero. But if we use $f(z_1) = z_1$ when the transmission is noisy, we get the following cost:

$$J^*(f) = 2\epsilon^2. \tag{4.17}$$

In other words, if we fix the strategies to be the optimal strategies for the noiseless transmission case while we introduce a small transmission noise, the increase in the cost will be proportional to the transmission noise intensity. However, if we use (4.15), we can indeed improve the cost by the fourth order in $\epsilon$.

5  Concluding  Remarks 

We analyzed an example of a decentralized stochastic system.
This example was a reformulation of the Witsenhausen
counter-example where the first station was allowed
to send its information to the second station through a noisy
channel. The dynamics were linear, all the underlying
uncertainties were assumed to be Gaussian and the cost was
quadratic. However, the presence of the communication
uncertainty had generated a non-classical information pattern.
Therefore, in general, we would have a non-convex functional
optimization problem.

We considered the case where the communication uncertainty
was small. We followed an asymptotic approach
where we approximated the cost based on its expansion in
terms of the small transmission noise intensity. We showed
how minimizing the approximated cost can be seen as a
singular optimization problem. We then used a variational
approach in order to find the necessary conditions for the
asymptotically optimal strategies and showed that some
reasonable linear strategies would actually satisfy those
conditions. We also provided some intuitive explanations for the
behavior of those linear strategies and obtained their
corresponding cost.

All the derivations and the results in this paper show some
of the difficulties involved in dealing with decentralized
systems as soon as we deviate a little bit from a classical, or at
least a partially nested, information pattern. On the other
hand, even though we have modeled the communication
uncertainty in the simplest possible way, we have tried to
emphasize the role of communication uncertainties in
generating such information patterns that are very difficult to
handle.

Finally, even though the optimization problem is generally
difficult for this class of systems, in some applications we
might be able to exploit the specific structure of the system
in order to obtain some reasonably good sub-optimal
strategies, which would yield an acceptable performance.

Abstract

We consider a two-station decentralized Linear Quadratic
Gaussian problem, where the stations are allowed to
communicate some pieces of information. We investigate a
possible sub-optimal approach where the controls are obtained
based on two separate centralized problems. Various cases
will be considered in which the two stations communicate
their measurements, their controls or estimates, and their
estimation residuals through noisy channels. We will mainly
focus on the closed-loop stability properties. We will show
that even if the stations communicate all their measurements
through low noise or even noiseless channels, the controls
obtained from the two centralized LQG problems may fail
to stabilize the closed-loop decentralized system.

1  Introduction 

One of the most challenging problems for control engineers
is to design controllers for large scale decentralized
systems which are composed of a large number of spatially
distributed interconnected subsystems. Uninhabited Air
Vehicles (UAVs) flying in formation and Automated Vehicles
driving in platoons are two examples. Also, recent
advances in computing and communication technologies have
introduced many new applications where dynamic systems
would form spatial networks. However, there are still
major difficulties in designing controllers for such systems that
could achieve some specified level of performance.

It is always difficult to find decentralized stabilizing
controllers as soon as the system has unstable fixed modes [3].
Incorporating uncertainties makes the problem even more
difficult. This can be seen in a seemingly simple
counter-example introduced by Witsenhausen in 1968, whose
solution remains an open problem today. Witsenhausen showed
that finding the optimal decentralized strategies, even for
a very simple two-stage problem with linear dynamics,
Gaussian uncertainties and quadratic cost, could be
extremely hard as soon as the information pattern becomes
non-classical.

*This research was supported in part by the National Science
Foundation under Grant ECS-9502945, Air Force Office of Scientific Research
under Grant F49620-97-1-0272 and Office of Naval Research under Award
N00014-97-l-(W39

In defining a decentralized linear quadratic Gaussian
problem, we will assume all stations have linear dynamics and
all uncertainties are modeled as Gaussian processes.
Moreover, each local controller only has access to its own
local information, which includes its own measurements and
possibly information received through communication with
other stations. Such a decentralized nature of information
generally induces a non-classical information pattern for
this class of problems. Therefore, except for some
special structures, where the information pattern is actually a
classical pattern [2], the optimal strategies are usually
unknown. Some sub-optimal approaches, however, might be
proposed. One such approach is to treat the problem as
a collection of separate centralized problems. A
motivation for this approach would become clearer if we assume
that each station is allowed to communicate all its
measurements through low noise communication channels with all
the other stations. Even though a huge burden of
computation and communication resources may be needed in this
scenario, we would expect the controllers to be very close
to the optimal stabilizing decentralized controllers.

In the next section, we formulate a simple two-station
decentralized LQG problem. In Section 3, we discuss the
above mentioned sub-optimal approach, where we propose
a solution based on two separate centralized problems. In
Section 4, we investigate the stability properties of our
controllers in various scenarios. Namely, we first consider the
case where the stations do not communicate at all. Then,
we assume that the stations can communicate their state
estimates or equivalently their controls. In these scenarios, as
we shall see, there is little justification for our approach. But
later, we will discuss the case where the stations are allowed
to communicate their measurements. As we mentioned, our
approach seems very reasonable for this scenario, at least
when the transmission noise intensities are assumed to be
small. However, as one of the main results in this paper,
we will show that even in this case, our controllers may fail
to stabilize the closed-loop system. This clearly contradicts
our expectation. In another scenario, the stations will be
allowed to communicate both their measurements and their
controls. We will see how sending the controls will help
us achieve at least closed-loop stability with our controllers.
Then, we assume that the stations communicate only their
estimation residuals. We will show that when the
transmission noise intensities are small, sending estimation
residuals would be enough to achieve closed-loop stability in our
sub-optimal approach. Our concluding remarks appear in
the final section.


2  Problem  Statement 


Consider the following decentralized linear system with two
stations:

    \dot{x}(t) = A x(t) + B_1 u^1(t) + B_2 u^2(t) + w(t)    (2.1)

    z^1(t) = H^1 x(t) + v^1(t)    (2.2)

    z^2(t) = H^2 x(t) + v^2(t)    (2.3)

where x(t) \in R^n is the global state vector, u^1(t) \in R^{m_1}
and z^1(t) \in R^{p_1} are the control and the information
vectors for the first station and u^2(t) \in R^{m_2} and z^2(t) \in R^{p_2}
are the control and the information vectors for the second
station. The process noise and the information noises are
denoted by w(t), v^1(t) and v^2(t) respectively, which are
all assumed to be zero mean white Gaussian with intensity
matrices W, V^1 and V^2. They are also assumed to be
mutually independent and independent of the initial state. Note
that we distinguish between measurement and information,
simply because of the fact that the information vector for
a station may also include the transmitted measurements of
the other station.


The original objective is to find u^1 = u^1(z^1) and u^2 =
u^2(z^2) in order to minimize the following cost:

    J = \lim_{T \to \infty} \frac{1}{T} E\left\{ \int_0^T \left( x^T Q x + u^{1T} R_1 u^1 + u^{2T} R_2 u^2 \right) dt \right\}    (2.4)

Since  the  stations  in  general  have  access  to  different  infor¬ 
mation,  we  have  a  non-classical  information  pattern.  More¬ 
over,  the  information  pattern  is  not  partially  nested.  That  is, 
the  information  available  to  each  station  is  being  affected  by 
the  control  action  of  the  other  station,  while  there  is  no  way 
for  that  station  to  obtain  any  information  about  those  control 
actions.  Therefore,  in  general,  we  will  have  a  non-convex 
functional  optimization  problem,  the  solutions  of  which  are 
usually  very  difficult  to  find. 


One possible sub-optimal approach is to solve two separate
centralized problems. We will discuss this approach in the
following sections. But there are two points that we need to
mention now. As we shall see, in many cases, we are fixing
the structure of our controllers only based on the
centralized results. Even though this comes naturally out of our
lack of knowledge about the structure of the decentralized
controllers, it may well be justified for the case where the
stations communicate all their measurements through low
noise channels. The other point is our choice of model for
the uncertainty in the transmitted information. We simply
model the received information signal as the transmitted
signal plus a Gaussian transmission noise. While this model is
realistic for analog communication systems, it may not be
well justified when digital communication is used. Namely,
in digital communication systems, the signal is quantized,
coded and sent through the channel. The channel noise may
still be assumed to be additive and Gaussian, but
sophisticated modulation and coding schemes make it difficult to
assume a simple additive Gaussian uncertainty for the
received information signal. However, if we try to
incorporate the quantization effects along with the error probability
distribution for some good coding and modulation schemes
in order to model the communication uncertainties, we will
end up with models which could still be approximated, to
some degree, by simple additive Gaussian models. On the
other hand, since there are already major difficulties in
dealing with decentralized non-classical information patterns,
using more complex models for communication
uncertainties does not seem very reasonable at this point.
Furthermore, we believe the results obtained under such a
simplifying assumption would still be helpful in giving us insight
towards the true nature of decentralized controllers.
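
As a rough illustration of why an additive-noise model can stand in for quantization effects, the following sketch (not from the paper) compares the mean-square error of a hypothetical uniform quantizer with the classical value step^2/12, which is the variance of an error uniformly distributed over one quantization cell. The step size, the input sweep and the helper names `quantize` and `quantization_error_power` are assumptions chosen for illustration only; the sketch says nothing about coding or modulation.

```python
import math

def quantize(x, step):
    """Hypothetical uniform quantizer: nearest multiple of step."""
    return math.floor(x / step + 0.5) * step

def quantization_error_power(step, n_samples=20000, lo=-1.0, hi=1.0):
    """Mean-square quantization error over an evenly spaced input sweep."""
    total = 0.0
    for i in range(n_samples):
        # Midpoint sweep of [lo, hi]; a stand-in for a slowly varying signal.
        x = lo + (i + 0.5) * (hi - lo) / n_samples
        e = quantize(x, step) - x
        total += e * e
    return total / n_samples

step = 0.1
measured = quantization_error_power(step)
# Variance of an error uniformly distributed on [-step/2, step/2].
additive_model = step ** 2 / 12.0
print(measured, additive_model)
```

Under these assumptions the two printed numbers agree closely, which is the sense in which a quantized signal may be approximated by the transmitted signal plus an additive noise of intensity step^2/12.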


3  A  Sub-optimal  Approach 

One possible sub-optimal approach in dealing with
decentralized problems is to decompose them into several
centralized problems in a reasonable fashion. One of our main
objectives in this paper is to investigate such an approach and
elaborate more on some of the important properties of the
controllers under various communication scenarios among
the stations.

Consider the system (2.1) again. We would like to
design the controls based on two centralized LQG problems.
Namely, let each station pretend that it has access to both
of the controls while it only has access to its own
information. In other words, the i-th station (i = 1, 2) wants to
design u_1^i = u_1^i(z^i) and u_2^i = u_2^i(z^i) in order to minimize
the following cost:

    J^i = \lim_{T \to \infty} \frac{1}{T} E\left\{ \int_0^T \left( x^T Q x + u_1^{iT} R_1 u_1^i + u_2^{iT} R_2 u_2^i \right) dt \right\}.    (3.1)

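From the well-known centralized LQG results [1], the
optimal controls can be obtained as:

    \begin{bmatrix} u_1^i(t) \\ u_2^i(t) \end{bmatrix}
      = \begin{bmatrix} -R_1^{-1} B_1^T \Pi \hat{x}^i(t) \\ -R_2^{-1} B_2^T \Pi \hat{x}^i(t) \end{bmatrix}
      = \begin{bmatrix} -K_1 \hat{x}^i(t) \\ -K_2 \hat{x}^i(t) \end{bmatrix},    i = 1, 2,    (3.2)
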
where \Pi is obtained from the steady-state control Riccati
equation:

    -\Pi A - A^T \Pi + \Pi \left( B_1 R_1^{-1} B_1^T + B_2 R_2^{-1} B_2^T \right) \Pi - Q = 0,    (3.3)

and \hat{x}^i is the local state estimate in the i-th station:

    \dot{\hat{x}}^i(t) = A \hat{x}^i(t) + B_1 u_1^i(t) + B_2 u_2^i(t) + L_i \left( z^i(t) - H^i \hat{x}^i(t) \right).    (3.4)

The estimator gain is obtained as:

    L_i = P_i (H^i)^T (V^i)^{-1},    i = 1, 2,    (3.5)

where P_i is the solution to the corresponding steady-state
filter Riccati equation:

    A P_i + P_i A^T - P_i (H^i)^T (V^i)^{-1} H^i P_i + W = 0,    i = 1, 2.    (3.6)

Note that the only difference in the two centralized
problems comes from the fact that the stations have access to
different information, i.e. from the matrix H^i and the noise
intensity matrix V^i.

After solving the two centralized problems, u_1^1 and u_2^2 will
be applied to the decentralized system. Obviously, there is
no reason for these controllers to be optimal for the
decentralized system. Also they are not guaranteed to preserve
any level of performance including even the closed-loop
stability. However, in some cases, where the stations are
allowed to communicate some pieces of information through
low noise channels, we would expect the local stations to
generate very similar controllers, which in turn are expected
to be very close to the decentralized optimal controllers.

4 Closed-Loop Stability

Achieving closed-loop stability is one of the most important
performance properties that we would desire for our
controllers. On the other hand, the centralized LQG controllers
will always stabilize the system under some detectability
and stabilizability conditions. But in general, there is no
reason to guarantee closed-loop stability if we apply the
same centralized controls to the decentralized system. In
this section, we will investigate the closed-loop stability
properties of our controllers in various situations, where the
stations communicate different pieces of information. Note
that in some cases, based on the available information for
each station, we may modify the estimators and hence
deviate a little bit from the original centralized LQG solutions.
In such cases, we will instead be looking at general linear
estimate linear feedback structures.

In order to analyze the dynamics of the closed loop system,
we define the local estimation errors and the difference
between the local estimates respectively as:

    e_1(t) = x(t) - \hat{x}^1(t)    (4.1)

    e_2(t) = x(t) - \hat{x}^2(t)    (4.2)

    e_{12}(t) = \hat{x}^1(t) - \hat{x}^2(t).    (4.3)

It is straightforward to obtain:

    \dot{x} = A x - B_1 K_1 \hat{x}^1 - B_2 K_2 \hat{x}^2 + w
            = (A - B_1 K_1 - B_2 K_2) x + (B_1 K_1 + B_2 K_2) e_2 - B_1 K_1 e_{12} + w    (4.4)

    \dot{e}_2 = (A - L_2 H^2) e_2 - B_1 K_1 e_{12} + w - L_2 v^2    (4.5)

    \dot{e}_{12} = (A - B_1 K_1 - B_2 K_2 - L_1 H^1) e_{12}
                 + (L_1 H^1 - L_2 H^2) e_2 + L_1 v^1 - L_2 v^2    (4.6)

Hence, the closed-loop system dynamics can be written as
follows:

    \begin{bmatrix} \dot{x} \\ \dot{e}_2 \\ \dot{e}_{12} \end{bmatrix}
      = \begin{bmatrix}
          A - B_1 K_1 - B_2 K_2 & B_1 K_1 + B_2 K_2 & -B_1 K_1 \\
          0 & A - L_2 H^2 & -B_1 K_1 \\
          0 & L_1 H^1 - L_2 H^2 & A - B_1 K_1 - B_2 K_2 - L_1 H^1
        \end{bmatrix}
        \begin{bmatrix} x \\ e_2 \\ e_{12} \end{bmatrix}
      + \begin{bmatrix} I & 0 & 0 \\ I & 0 & -L_2 \\ 0 & L_1 & -L_2 \end{bmatrix}
        \begin{bmatrix} w \\ v^1 \\ v^2 \end{bmatrix}    (4.7)

4.1  No  Transmission 

Assume that each station only has access to its own
measurements, i.e., there is no communication between the
stations. In this case, the closed-loop dynamics are in the form
(4.7) where H^1 and H^2 are the corresponding measurement
matrices for the stations, while v^1 and v^2 simply denote the
measurement uncertainties.

Let's assume that the stations have the same measurement
characteristics. Then it is clear from (4.7) that in order to
have a stable closed-loop system, we need to have stable
feedback dynamics along with stable local estimators and
compensators. We conjecture that these stability properties
are sufficient for the closed-loop stability even if the
stations do not have identical measurements. But to achieve
such stability properties, we need the global state to be
detectable from each local station. This condition, however, is
a very strong condition for a decentralized system. In most
decentralized systems the global state cannot be detected
from all individual stations. Moreover, even if such a strong
condition is satisfied, we still do not have any good
justification for our sub-optimal approach in this case. There is
really no reason to expect the two centralized controllers to
have a good performance if they are applied to the
decentralized system.
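
To make the three stability requirements above concrete, here is a minimal sketch for a hypothetical scalar plant in which both stations have identical measurement characteristics. All numerical values (`a`, `b1`, `b2`, `h`, the cost weights and the noise intensities) are assumptions picked so that the scalar forms of the control and filter Riccati equations of Section 3 have simple closed-form solutions; none of them come from the paper. With identical stations the two estimator gains coincide, the (3,2) entry of the closed-loop matrix in the coordinates (x, e_2, e_12) vanishes, and the diagonal entries are the closed-loop eigenvalues.

```python
import math

# Hypothetical scalar plant dx/dt = a*x + b1*u1 + b2*u2 + w (assumed values).
a, b1, b2 = 1.0, 1.0, 1.0
q, r1, r2 = 4.0, 1.0, 1.0          # assumed cost weights
h, w_int, v_int = 1.0, 3.0, 1.0    # same measurement model at both stations

# Scalar control Riccati equation: -2*a*Pi + s*Pi**2 - q = 0,
# with s = b1^2/r1 + b2^2/r2; take the positive root.
s = b1 ** 2 / r1 + b2 ** 2 / r2
Pi = (a + math.sqrt(a ** 2 + s * q)) / s
k1, k2 = b1 * Pi / r1, b2 * Pi / r2

# Scalar filter Riccati equation: 2*a*P - (h^2/v)*P**2 + w = 0; positive root.
P = (a + math.sqrt(a ** 2 + w_int * h ** 2 / v_int)) * v_int / h ** 2
l1 = l2 = P * h / v_int            # identical stations give identical gains

feedback = a - b1 * k1 - b2 * k2               # feedback dynamics
estimator = a - l2 * h                         # local estimator dynamics
compensator = a - b1 * k1 - b2 * k2 - l1 * h   # compensator dynamics

# Closed-loop matrix in the coordinates (x, e2, e12).
closed_loop = [
    [feedback, b1 * k1 + b2 * k2, -b1 * k1],
    [0.0, estimator, -b1 * k1],
    [0.0, l1 * h - l2 * h, compensator],
]
# The matrix is upper triangular here, so its diagonal holds the eigenvalues.
eigenvalues = [closed_loop[i][i] for i in range(3)]
print(eigenvalues)  # [-3.0, -2.0, -6.0]
```

In this scalar example all three diagonal entries are negative, so the closed loop is stable. For a scalar plant with LQG gains the compensator entry is never the one that fails; an unstable compensator requires at least a second-order plant.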

4.2 Control (Estimate) Transmission

In this scenario, the stations communicate only their
controls. In other words, each station has access to its own
local measurements and the transmitted control of the other
station. As we have already mentioned, the communication
uncertainties are simply modeled as additive Gaussian
noises. Also all the communications are assumed to be
instantaneous. Therefore, the information available to the first
station is:

    z_1 = H_1 x + v_1,    u_2(t) + v_{t2}(t),    (4.8)

while the second station has access to the following
information:

    z_2 = H_2 x + v_2,    u_1(t) + v_{t1}(t),    (4.9)

where v_{t1} and v_{t2} are the corresponding transmission
noises. Each station now incorporates the received control
of the other station in its local estimator. Namely, the local
estimators are:

    \dot{\hat{x}}^1 = A \hat{x}^1 + B_1 u_1 + B_2 u_2 + B_2 v_{t2} + L_1 (z_1 - H_1 \hat{x}^1),    (4.10)

    \dot{\hat{x}}^2 = A \hat{x}^2 + B_1 u_1 + B_1 v_{t1} + B_2 u_2 + L_2 (z_2 - H_2 \hat{x}^2),    (4.11)

where:

    L_1 = P_1 H_1^T V_1^{-1}    (4.12)

    L_2 = P_2 H_2^T V_2^{-1}    (4.13)

and P_1 and P_2 are still the solutions to the corresponding
Riccati equations. Note that P_1 and P_2 are not the local
estimation error covariances anymore. The following controls
are now applied to the decentralized system:

    u_1(t) = -R_1^{-1} B_1^T \Pi \hat{x}^1(t) = -K_1 \hat{x}^1(t),    (4.14)

    u_2(t) = -R_2^{-1} B_2^T \Pi \hat{x}^2(t) = -K_2 \hat{x}^2(t),    (4.15)

where \Pi is the solution to the corresponding steady-state
control Riccati equation. It is straightforward to obtain the
dynamics of the closed-loop system:

    \begin{bmatrix} \dot{x} \\ \dot{e}_1 \\ \dot{e}_2 \end{bmatrix}
      = \begin{bmatrix}
          A - B_1 K_1 - B_2 K_2 & B_1 K_1 & B_2 K_2 \\
          0 & A - L_1 H_1 & 0 \\
          0 & 0 & A - L_2 H_2
        \end{bmatrix}
        \begin{bmatrix} x \\ e_1 \\ e_2 \end{bmatrix}
      + \begin{bmatrix} I & 0 & 0 \\ I & -L_1 & 0 \\ I & 0 & -L_2 \end{bmatrix}
        \begin{bmatrix} w \\ v_1 \\ v_2 \end{bmatrix}
      + \begin{bmatrix} 0 \\ -B_2 \\ 0 \end{bmatrix} v_{t2}
      + \begin{bmatrix} 0 \\ 0 \\ -B_1 \end{bmatrix} v_{t1}.    (4.16)

It is clear that the closed-loop system can be stabilized if the
system is stabilizable using both stations and is detectable
from each individual station. As we mentioned earlier, this
latter condition cannot be satisfied in many decentralized
systems. Also even if the control transmission is noiseless,
there is still no reason to believe that these centralized
controllers are, in any sense, close to the optimal decentralized
controllers.

Note that communicating the local estimates is actually
equivalent to communicating the controls. This is because
we have a cooperative structure. That is, each station can be
informed of the control strategy and specifically the
estimator and feedback gains of the other station a priori.
Therefore, the stations can simply calculate either the control or
the estimate upon receiving the other.

Finally, note that we have incorporated the transmitted
controls in the local estimators in a rather straightforward
manner. Whether there are better ways to incorporate this new
information is a problem to be addressed.

4.3 Measurement Transmission

Assume now that the stations can communicate all their
measurements. In this case, the information available to the
stations can be expressed as:

    z^1 = \begin{bmatrix} z_1^1 \\ z_2^1 \end{bmatrix}
        = \begin{bmatrix} H_1 x + v_1 \\ H_2 x + v_2 + v_{21} \end{bmatrix}
        = H x + v^1    (4.17)

    z^2 = \begin{bmatrix} z_1^2 \\ z_2^2 \end{bmatrix}
        = \begin{bmatrix} H_1 x + v_1 + v_{12} \\ H_2 x + v_2 \end{bmatrix}
        = H x + v^2,    (4.18)

where v_{12}(t) and v_{21}(t) are independent transmission
noises, which are also assumed to be independent of other
underlying uncertainties in the system. Note that in this
scenario, both stations have the same information matrix H.
Therefore, there cannot be any decentralized fixed mode in
this case.

Similar to the previous cases, we solve two separate
centralized LQG problems. For the first station we get:

    \begin{bmatrix} u_1^1(t) \\ u_2^1(t) \end{bmatrix}
      = \begin{bmatrix} -R_1^{-1} B_1^T \Pi \hat{x}^1(t) \\ -R_2^{-1} B_2^T \Pi \hat{x}^1(t) \end{bmatrix}
      = \begin{bmatrix} -K_1 \hat{x}^1(t) \\ -K_2 \hat{x}^1(t) \end{bmatrix},    (4.19)

where:

    \dot{\hat{x}}^1 = A \hat{x}^1 + B_1 u_1^1 + B_2 u_2^1 + L_1 (z^1 - H \hat{x}^1)    (4.20)

    L_1 = P_1 H^T (V^1)^{-1}    (4.21)

    A P_1 + P_1 A^T - P_1 H^T (V^1)^{-1} H P_1 + W = 0    (4.22)

    H = \begin{bmatrix} H_1 \\ H_2 \end{bmatrix},    V^1 = \begin{bmatrix} V_1 & 0 \\ 0 & V_2 + V_{21} \end{bmatrix},    (4.23)

and for the second station:

    \begin{bmatrix} u_1^2(t) \\ u_2^2(t) \end{bmatrix}
      = \begin{bmatrix} -R_1^{-1} B_1^T \Pi \hat{x}^2(t) \\ -R_2^{-1} B_2^T \Pi \hat{x}^2(t) \end{bmatrix}
      = \begin{bmatrix} -K_1 \hat{x}^2(t) \\ -K_2 \hat{x}^2(t) \end{bmatrix},    (4.24)

where:

    \dot{\hat{x}}^2 = A \hat{x}^2 + B_1 u_1^2 + B_2 u_2^2 + L_2 (z^2 - H \hat{x}^2)    (4.25)

    L_2 = P_2 H^T (V^2)^{-1}    (4.26)

    A P_2 + P_2 A^T - P_2 H^T (V^2)^{-1} H P_2 + W = 0    (4.27)

    V^2 = \begin{bmatrix} V_1 + V_{12} & 0 \\ 0 & V_2 \end{bmatrix}.    (4.28)

In this scenario, we have a very good justification for our
sub-optimal approach. Namely, if the transmissions are
noiseless, the two centralized problems would be identical.
Therefore, we would expect our controllers to be the
optimal decentralized controllers, which would preserve all the
desired properties including the closed-loop stability.
Furthermore, if the transmissions are noisy but the transmission
noise intensities are small, we would still expect the
controllers to be close to the optimal stabilizing decentralized
controllers. In other words, we would not expect any
drastic change in the behavior of the controlled decentralized
system upon introducing some small transmission noise.

We shall now look at the closed-loop stability properties. It
is easy to obtain the following closed-loop system dynamics
which is valid for any linear estimate linear feedback
structure:

    \begin{bmatrix} \dot{x} \\ \dot{e}_2 \\ \dot{e}_{12} \end{bmatrix}
      = \begin{bmatrix}
          A - B_1 K_1 - B_2 K_2 & B_1 K_1 + B_2 K_2 & -B_1 K_1 \\
          0 & A - L_2 H & -B_1 K_1 \\
          0 & (L_1 - L_2) H & A - B_1 K_1 - B_2 K_2 - L_1 H
        \end{bmatrix}
        \begin{bmatrix} x \\ e_2 \\ e_{12} \end{bmatrix}
      + \begin{bmatrix} I & 0 & 0 \\ I & 0 & -L_2 \\ 0 & L_1 & -L_2 \end{bmatrix}
        \begin{bmatrix} w \\ v^1 \\ v^2 \end{bmatrix}    (4.29)

We notice that the closed-loop system matrix has an
interesting structure. The first diagonal block matrix is simply
the matrix associated with the feedback dynamics, which
could be stabilized if the system is stabilizable using both
control stations. The second diagonal block matrix could
also be made stable under a simple detectability condition,
that is, if the global state is detectable using both stations.
Note that this is a much weaker condition than
detectability from each individual station, which would be required if
the stations did not communicate their measurements. The
third diagonal block matrix, however, is the matrix
corresponding to the compensator dynamics, which may not be
stable.

This is a significant result. Let's assume that the
transmission noise intensities are very small. Then the estimator
gains would be almost the same and the closed-loop system
matrix would be very close to a block upper-triangular
matrix. We can see that if the compensator is unstable (which
might be the case in many systems, especially those with a
non-minimum phase structure), the closed-loop system will
become unstable because of the unstable dynamics
governing the difference between the estimates of the two local
estimators. Actually, even when the transmissions are
noiseless, there is still an unstable subsystem corresponding to
e_{12}. This does not comply with our initial expectation. Note
that there is no forcing input for this unstable subsystem, but
any small nonzero e_{12} could propagate to infinity! Such a
nonzero difference between the local estimates, which could
be generated from any difference in the initial conditions of
the local estimators, round off errors, etc., would again
induce a non-classical information pattern.

4.4 Measurement and Control Transmission

We saw that if the stations communicate only their
measurements, our specific sub-optimal controllers may not be able
to stabilize the closed-loop system, even though they will
yield the centralized optimal stabilizing controllers, in the
limit, when the transmission noise intensities go to zero. In
this section, we will see how transmitting the controls along
with the measurements will help us stabilize the closed-loop
system, using a similar sub-optimal approach.

As in the previous case, assume that the stations transmit
their measurements through noisy channels, i.e.:

    z^1(t) = \begin{bmatrix} z_1^1(t) \\ z_2^1(t) \end{bmatrix}
           = \begin{bmatrix} H_1 x(t) + v_1(t) \\ H_2 x(t) + v_2(t) + v_{21}(t) \end{bmatrix}    (4.30)

    z^2(t) = \begin{bmatrix} z_1^2(t) \\ z_2^2(t) \end{bmatrix}
           = \begin{bmatrix} H_1 x(t) + v_1(t) + v_{12}(t) \\ H_2 x(t) + v_2(t) \end{bmatrix}    (4.31)

Also assume that the stations communicate their controls.
For a little more generality, let's assume that the
communication uncertainties on the controls are modeled by an
additive Gaussian uncertainty along with a scale-factor error.
Namely, the first station also has access to (I + \Delta_2) u^2(t) +
v_{t2}(t), while the second station receives (I + \Delta_1) u^1(t) +
v_{t1}(t). Transmission noises v_{t1}(t) and v_{t2}(t) are assumed
to be independent of each other and also independent of all
other uncertainties in the system.

Similar to Section 4.2, each station incorporates the
transmitted control of the other station in its local estimator. That
is, the estimators are constructed in the following manner:

    \dot{\hat{x}}^1 = A \hat{x}^1 + B_1 u^1 + B_2 (I + \Delta_2) u^2 + B_2 v_{t2} + L_1 (z^1 - H \hat{x}^1)    (4.32)

    \dot{\hat{x}}^2 = A \hat{x}^2 + B_1 (I + \Delta_1) u^1 + B_1 v_{t1} + B_2 u^2 + L_2 (z^2 - H \hat{x}^2),    (4.33)

where:

    L_1 = P_1 H^T (V^1)^{-1}    (4.34)

    L_2 = P_2 H^T (V^2)^{-1}    (4.35)

and P_1 and P_2 are obtained from the same Riccati equations
as before. Note that P_1 and P_2 are no longer the estimation
error covariances. Using the same definitions for the error
variables e_1(t) and e_2(t), the closed-loop dynamics may be
written as:

    \begin{bmatrix} \dot{x} \\ \dot{e}_1 \\ \dot{e}_2 \end{bmatrix}
      = \begin{bmatrix}
          A - B_1 K_1 - B_2 K_2 & B_1 K_1 & B_2 K_2 \\
          -B_2 K_2 \Delta_2 & A - L_1 H & B_2 K_2 \Delta_2 \\
          -B_1 K_1 \Delta_1 & B_1 K_1 \Delta_1 & A - L_2 H
        \end{bmatrix}
        \begin{bmatrix} x \\ e_1 \\ e_2 \end{bmatrix}
      + \begin{bmatrix} I & 0 & 0 \\ I & -L_1 & 0 \\ I & 0 & -L_2 \end{bmatrix}
        \begin{bmatrix} w \\ v^1 \\ v^2 \end{bmatrix}
      + \begin{bmatrix} 0 \\ -B_2 \\ 0 \end{bmatrix} v_{t2}
      + \begin{bmatrix} 0 \\ 0 \\ -B_1 \end{bmatrix} v_{t1}    (4.36)

As we can see, when the scale-factor errors \Delta_1 and \Delta_2 are
small, the closed-loop system matrix is nearly block
upper-triangular. The first diagonal block matrix can be made
stable if the system is stabilizable using both stations. The
second and the third diagonal block matrices can also be made
stable if (A, H) is detectable.

We conclude that when the stations communicate their
controls as well as their measurements, our sub-optimal
approach will at least yield a stable closed-loop system, even
if there are small scale-factor errors on the control
transmissions.
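
The contrast between transmitting measurements only and transmitting measurements together with controls can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's design: the plant is a hypothetical non-minimum phase example with transfer function (s - 1)/(s(s - 2)), both stations are given the same input matrix, and the gains are hand-picked by pole placement rather than computed from the Riccati equations (the block structures hold for any linear estimate linear feedback structure). Transmissions are taken noiseless, so L_1 = L_2, and the scale-factor errors are set to zero.

```python
import numpy as np

# Hypothetical plant (assumed values), transfer function (s - 1) / (s (s - 2)).
A = np.array([[0.0, 1.0], [0.0, 2.0]])
B1 = B2 = np.array([[0.0], [1.0]])
H = np.array([[-1.0, 1.0]])            # common information matrix

# Hand-picked gains (pole placement, not LQG-optimal):
K1 = K2 = np.array([[1.0, 2.5]])       # A - B1 K1 - B2 K2 has eigenvalues -1, -2
L1 = L2 = np.array([[21.0], [30.0]])   # A - L H has eigenvalues -3, -4

Z = np.zeros((2, 2))
Acl = A - B1 @ K1 - B2 @ K2            # feedback dynamics
Acomp = Acl - L1 @ H                   # compensator dynamics

# Measurements only: state (x, e2, e12), block structure of Section 4.3.
meas_only = np.block([
    [Acl, B1 @ K1 + B2 @ K2, -B1 @ K1],
    [Z, A - L2 @ H, -B1 @ K1],
    [Z, (L1 - L2) @ H, Acomp],
])

# Measurements and controls, zero scale-factor errors: state (x, e1, e2).
meas_ctrl = np.block([
    [Acl, B1 @ K1, B2 @ K2],
    [Z, A - L1 @ H, Z],
    [Z, Z, A - L2 @ H],
])

def max_real_eig(M):
    """Largest real part among the eigenvalues of M."""
    return float(np.max(np.linalg.eigvals(M).real))

print(max_real_eig(meas_only), max_real_eig(meas_ctrl))
```

With these assumed numbers the compensator matrix has eigenvalues 7 and -19, so the first printed value is approximately 7 (unstable), while the second is approximately -1 (stable). The feedback and estimator blocks are identical in both cases; only the presence of the compensator block on the diagonal changes the outcome.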

4.5 Estimation Residuals Transmission

So far, we have seen that in order to design a set of
sub-optimal stabilizing controllers by solving two centralized
problems for a two-station decentralized system and under
some reasonable stabilizability and detectability
assumptions, the stations need to communicate both their
measurements and controls.

In this section, we investigate the case where the
stations communicate their estimation residuals instead of their
measurements and controls. In other words, the first station
has access to the following information:

    z_1 = H_1 x + v_1,    (z_2 - H_2 \hat{x}^2) + v_{t2},    (4.37)

while the information available to the second station is:

    z_2 = H_2 x + v_2,    (z_1 - H_1 \hat{x}^1) + v_{t1},    (4.38)

where v_{t1} and v_{t2} denote the transmission noises. In the
previous cases, the linear structure of the estimators and the
controllers naturally came out of the two centralized optimal
control problems. In this case, however, we will impose
a linear structure on our estimation and control such that
each station will linearly incorporate the noisy residual of
the other station, i.e., for the first station, we have:


lii(f) 

(4.39) 

x'l  = 

-Hix^) 

(4.40) 

while for the second station, we get:

t4{f) 

II 

1 

H> 

II 

1 

2X*(t) 

(4.41) 

f  2  — 

Ax^  4-  tif  -h  -B2t^2  *^1  (^1 

-Hix^) 

+-f^2  (2^2  ""-02®^)  +  iiVtl, 

(4.42) 

The gains may now be obtained based on some
optimality criteria. Note that when the transmission noises v_{t1} and
v_{t2} are zero, the local estimators will have exactly the same
structure. Therefore, we expect the estimators to have the
same gains in the noiseless transmission case, regardless of
how the gains are obtained. Also note that each station has
linearly incorporated the received estimation residual of the
other station. Even though this simplifies the problem, it is
not necessarily the best way of incorporating this new piece
of information.

Similarly to the previous cases, it is straightforward to
obtain the closed-loop dynamics as the following:

■  X  1  \ A- BiKi- B2K2  Bi Ki  -H B2K2 

€2  =  0  A-LIH1-LIH2  (4.43)

As we can see, when the transmission noise intensities are
small, the closed-loop system matrix will be close to a
block upper-triangular matrix, which can easily be
stabilized when the system is stabilizable using both stations
and (A, [H_1^T  H_2^T]^T) is detectable. This shows us that
in some sense, the estimation residuals are more valuable
than the measurements and communicating the residuals is
enough to stabilize the system by solving two centralized
problems.
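
The joint detectability condition used above is much weaker than detectability from each individual station. A minimal sketch of this distinction, using the standard Popov-Belevitch-Hautus rank test, is given below. The function name `is_detectable`, the tolerances and the example matrices are assumptions made for illustration: each hypothetical station measures only one of two unstable modes, so neither station can detect the global state alone, while the stacked measurement matrix can.

```python
import numpy as np

def is_detectable(A, H, tol=1e-9):
    """PBH test: every eigenvalue of A with nonnegative real part must be
    observable, i.e. [lam*I - A; H] must have full column rank."""
    n = A.shape[0]
    for lam in np.linalg.eigvals(A):
        if lam.real >= -tol:
            pbh = np.vstack([lam * np.eye(n) - A, H])
            if np.linalg.matrix_rank(pbh, tol=1e-8) < n:
                return False
    return True

# Hypothetical example: two decoupled unstable modes, one seen by each station.
A = np.diag([1.0, 2.0])
H1 = np.array([[1.0, 0.0]])   # station 1 measures the first mode only
H2 = np.array([[0.0, 1.0]])   # station 2 measures the second mode only

print(is_detectable(A, H1),
      is_detectable(A, H2),
      is_detectable(A, np.vstack([H1, H2])))  # False False True
```

In this example, closed-loop stability could not be obtained without communication, since each local estimator would have an unstable unobservable mode, whereas the stacked pair satisfies the detectability condition stated above.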

5  Concluding  Remarks 

A two-station decentralized LQG problem was formulated,
where the local controllers had to be designed based on
some local information in order to minimize a single common
cost. This problem generally has a non-classical information
pattern and the optimal controls are usually unknown.
One of the first possible sub-optimal approaches is
to decompose the problem into separate centralized problems.
In this paper, we investigated such an approach
for different communication scenarios between the stations,
namely, when the stations communicate their controls, their
measurements or both, or their estimation residuals.

We showed that even though our approach is quite reasonable
for the case where the stations communicate all their
measurements, it may fail to stabilize the closed-loop system
as soon as the compensator is unstable. Then, we
showed how this difficulty can be removed if the stations
either communicate both their measurements and their controls
or communicate their estimation residuals. We should
also mention that a similar problem can be formulated for
discrete-time systems and similar results can be obtained.

All these results show some of the fundamental differences
between the centralized and the decentralized structures.
Moreover, we have tried to elaborate on the role of communication
among the stations and the corresponding uncertainties.
While many new applications for spatially distributed
dynamic systems are emerging, there are still major
difficulties that need to be addressed.
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SUMMARY 

A  fault  detection  and  identification  algorithm  is  determined  from  a  generalization  of  the  least-squares 
derivation  of  the  Kalman  filter.  The  objective  of  the  filter  is  to  monitor  a  single  fault  called  the  target  fault 
and  block  other  faults  which  are  called  nuisance  faults.  The  filter  is  derived  from  solving  a  min-max  problem 
with  a  generalized  least-squares  cost  criterion  which  explicitly  makes  the  residual  sensitive  to  the  target  fault, 
but  insensitive  to  the  nuisance  faults.  It  is  shown  that  this  filter  approximates  the  properties  of  the  classical 
fault  detection  filter  such  that  in  the  limit  where  the  weighting  on  the  nuisance  faults  is  zero,  the  generalized 
least-squares  fault  detection  filter  becomes  equivalent  to  the  unknown  input  observer  where  there  exists 
a  reduced-order  filter.  Filter  designs  can  be  obtained  for  both  linear  time-invariant  and  time-varying 
systems.  Copyright  ©  2000  John  Wiley  &  Sons,  Ltd. 

KEY  WORDS:  fault  detection  and  identification;  unknown  input  observer;  worst  case  design;  time-varying 
system 


1.  INTRODUCTION 

Any  system  under  automatic  control  demands  a  high  degree  of  system  reliability.  This  requires 
a  health  monitoring  system  capable  of  detecting  any  plant,  actuator  and  sensor  fault  as  it  occurs 
and  identifying  the  faulty  component.  One  approach,  analytical  redundancy  which  reduces  the 
need  for  hardware  redundancy,  uses  the  modelled  dynamic  relationship  between  system  inputs 
and  measured  system  outputs  to  form  a  residual  process  used  for  detecting  and  identifying  faults. 
A  popular  approach  to  analytical  redundancy  is  the  unknown  input  observer  [1]  which  divides 
the  faults  into  two  groups:  a  single-target  fault  and  possibly  several  nuisance  faults.  The  nuisance 
faults  are  placed  in  an  invariant  subspace  which  is  unobservable  to  the  residual.  Recently, 
approximate unknown input observers have been developed which have improved robustness to
uncertainties and are applicable to time-varying systems [2,3].

In  this  paper,  a  generalized  least-squares  fault  detection  filter,  motivated  by  Chung  and  Speyer 
[2]  and  Bryson  and  Ho  [4],  is  presented.  A  new  least-squares  problem  with  an  indefinite  cost 
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criterion  is  formulated  as  a  min-max  problem  by  generalizing  the  least-squares  derivation  of  the 
Kalman  filter  [4]  and  allowing  the  explicit  dependence  on  the  target  fault  which  is  not  presented 
in  Reference  [2].  Since  the  filter  is  derived  similarly  to  Reference  [2],  many  properties  obtained  in 
Reference  [2]  also  apply  to  this  filter.  However,  some  new  important  properties  are  given.  For 
example, since the target fault direction is now explicitly in the filter gain calculation, a mechanism
is provided which enhances the sensitivity of the filter to the target fault. Furthermore, the
projector,  which  annihilates  the  residual  direction  associated  with  the  nuisance  faults  and  is 
assumed  in  the  problem  formulation  of  Reference  [2],  is  not  required  in  the  derivation  of  this 
filter.  Finally,  it  is  shown  that  this  filter  completely  blocks  the  nuisance  faults  in  the  limit  where 
the  weighting  on  the  nuisance  faults  is  zero.  For  time-invariant  systems,  the  nuisance  faults  are 
placed in a minimal (C, A)-unobservability subspace, and the generalized least-squares fault
detection  filter  becomes  equivalent  to  the  unknown  input  observer.  For  time-varying  systems,  the 
nuisance  faults  are  placed  in  a  similar  invariant  subspace,  and  the  generalized  least-squares  fault 
detection  filter  extends  the  unknown  input  observer  to  the  time-varying  case.  In  the  limit, 
a  reduced-order  filter  is  derived  for  time-varying  systems. 

The  problem  is  formulated  in  Section  2  and  its  solution  is  derived  in  Section  3  [2,4].  In  Section 
4,  the  filter  is  derived  in  the  limit  [2,5].  In  Section  5,  it  is  shown  that,  in  the  limit,  the  nuisance 
faults  are  placed  in  an  invariant  subspace.  In  Section  6,  the  reduced-order  filter  is  derived  in  the 
limit.  In  Section  7,  numerical  examples  are  given. 


2.  PROBLEM  FORMULATION 

Consider a linear, observable system with two failure modes [1,2]

$$\dot{x} = Ax + Bu + F_1\mu_1 + F_2\mu_2 \tag{1a}$$

$$y = Cx + v \tag{1b}$$

where u is the control input, y is the measurement, v is the sensor noise, $\mu_1$ is the target fault, and
$\mu_2$ is the nuisance fault. All system variables belong to real vector spaces.
System matrices A, B, C, $F_1$ and $F_2$ are time-varying and continuously differentiable. The failure
modes, $\mu_1$ and $\mu_2$, model the time-varying amplitude of the failure while the failure signatures, $F_1$
and $F_2$, model the directional characteristics of a failure. Assume $F_1$ and $F_2$ are monic so that
$\mu_1 \neq 0$ and $\mu_2 \neq 0$ imply $F_1\mu_1 \neq 0$ and $F_2\mu_2 \neq 0$, respectively. In References [1,2], it is shown that
this model, used to determine the fault detection filter, represents actuator, sensor and plant
faults. There are two assumptions about the system (1) that are needed in order to have a
well-conditioned unknown input observer. Assumption 2.1 ensures that the target fault can be isolated
from the nuisance fault [1,2]. The output separability test is discussed in Remark 1 of Section 5.
Assumption 2.2 ensures a non-zero residual in steady-state when the target fault occurs for
time-invariant systems [3,6].

Assumption  2.1. 

$F_1$ and $F_2$ are output separable.

Assumption  2.2. 

For time-invariant systems, $(C, A, F_1)$ does not have an invariant zero at the origin.
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The  objective  of  blocking  the  nuisance  fault  while  detecting  the  target  fault  can  be  achieved  by 
solving  the  following  min-max  problem: 

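$$\min_{\mu_1} \max_{\mu_2} \max_{x(t_0)} \; \frac{1}{2} \int_{t_0}^{t} \left( \|\mu_1\|^2_{Q_1^{-1}} - \|\mu_2\|^2_{\gamma Q_2^{-1}} - \|y - Cx\|^2_{V^{-1}} \right) d\tau - \frac{1}{2} \|x(t_0) - \hat{x}_0\|^2_{\Pi_0} \tag{2}$$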

subject to (1a). Note that, without the minimization with respect to $\mu_1$, (2) reduces to the standard
least-squares derivation of the Kalman filter [4]. t is the current time and y is assumed given. $Q_1$,
$Q_2$, V and $\Pi_0$ are positive definite. $\gamma$ is a non-negative scalar. Note that $Q_1$, $Q_2$, $\Pi_0$ and $\gamma$ are
design parameters to be chosen while V may be physically related to the power spectral density of
the sensor noise because of (1b) [4]. The interpretation of the min-max problem is the following.
Let $\mu_1^*$, $\mu_2^*$ and $x^*(t_0)$ be the optimal strategies for $\mu_1$, $\mu_2$ and $x(t_0)$, respectively. Then, $x^*(\tau \mid Y_t)$, the
x associated with $\mu_1^*$, $\mu_2^*$ and $x^*(t_0)$, is the optimal trajectory for x where $\tau \in [t_0, t]$ and given the
measurement history $Y_t = \{y(\tau) \mid t_0 \le \tau \le t\}$. Since $\mu_1$ maximizes $y - Cx$ and $\mu_2$ minimizes
$y - Cx$, $y - Cx^*$ is made primarily sensitive to $\mu_1$ and minimally sensitive to $\mu_2$. However, since
$x^*$ is the smoothed estimate of the state, a filtered estimate of the state, called $\hat{x}$, is needed for
implementation. From the boundary condition in Section 3, at the current time t, $x^*(t \mid Y_t) = \hat{x}(t)$.
Therefore, $y - C\hat{x}$ is primarily sensitive to the target fault and minimally sensitive to the nuisance
fault. Note that when $Q_1$ is larger, $y - C\hat{x}$ is more sensitive to the target fault. When $\gamma$ is smaller,
$y - C\hat{x}$ is less sensitive to the nuisance fault. In Reference [2], the differential game blocks the
nuisance fault, but does not enhance the sensitivity to the target fault. In Section 5, it is shown that
the filter completely blocks the nuisance fault when $\gamma$ is zero by placing it into an invariant
subspace, called Ker S. Therefore, the residual used for detecting the target fault is

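$$r = \hat{H}(y - C\hat{x}) \tag{3}$$

where $\hat{x}$, the filtered estimate of the state, is given in Section 3 and

$$\operatorname{Ker} \hat{H} = C \operatorname{Ker} S, \qquad \hat{H} = I - C \operatorname{Ker} S \left[ (C \operatorname{Ker} S)^T C \operatorname{Ker} S \right]^{-1} (C \operatorname{Ker} S)^T \tag{4}$$

Ker S is given and discussed in Sections 4 and 5.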
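
As a small illustration of the projector in (4), the sketch below forms $I - M(M^TM)^{-1}M^T$ for a matrix M whose columns are taken to span $C\,\mathrm{Ker}\,S$. The function name `annihilating_projector` and the numerical value of M are hypothetical, and M is assumed to have full column rank.

```python
import numpy as np

def annihilating_projector(M):
    """Return I - M (M^T M)^{-1} M^T, which maps every column of M to zero.

    M is assumed to have full column rank so that M^T M is invertible.
    """
    M = np.atleast_2d(np.asarray(M, dtype=float))
    m = M.shape[0]
    return np.eye(m) - M @ np.linalg.solve(M.T @ M, M.T)

# Hypothetical basis for C Ker S in a two-output system.
C_ker_S = np.array([[1.0], [1.0]])
H_hat = annihilating_projector(C_ker_S)
print(H_hat)
```

Any residual component lying along the columns of M is removed, which is how the nuisance fault direction is kept out of r in (3).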

3.  SOLUTION 

In  this  section,  the  min-max  problem  given  by  (2)  is  solved  [2,4].  The  variational  Hamiltonian  of 
the  problem  is 


$$\mathcal{H} = \frac{1}{2} \left( \|\mu_1\|^2_{Q_1^{-1}} - \|\mu_2\|^2_{\gamma Q_2^{-1}} - \|y - Cx\|^2_{V^{-1}} \right) + \lambda^T (Ax + Bu + F_1\mu_1 + F_2\mu_2)$$


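where $\lambda \in \mathbb{R}^n$ is a continuously differentiable Lagrange multiplier. The first-order necessary
conditions [4] imply that the optimal strategies for $\mu_1$, $\mu_2$ and the dynamics for $\lambda$ are

$$\mu_1^* = -Q_1 F_1^T \lambda, \qquad \mu_2^* = \frac{1}{\gamma} Q_2 F_2^T \lambda, \qquad \dot{\lambda} = -A^T \lambda - C^T V^{-1} (y - Cx)$$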




Int.  J.  Adapt,  Control  Signal  Process.  2000;  14:747-757 




R.  H.  CHEN  AND  J.  L.  SPEYER 


with boundary conditions

$$\lambda(t_0) = \Pi_0 [x^*(t_0) - \hat{x}_0], \qquad \lambda(t) = 0 \tag{5}$$


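By substituting $\mu_1^*$ and $\mu_2^*$ into (1a), the two-point boundary value problem requires the solution to

$$\begin{bmatrix} \dot{x}^* \\ \dot{\lambda} \end{bmatrix} = \begin{bmatrix} A & \frac{1}{\gamma} F_2 Q_2 F_2^T - F_1 Q_1 F_1^T \\ C^T V^{-1} C & -A^T \end{bmatrix} \begin{bmatrix} x^* \\ \lambda \end{bmatrix} + \begin{bmatrix} Bu \\ -C^T V^{-1} y \end{bmatrix} \tag{6}$$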


with boundary conditions (5). The form of (5) suggests that

$$\lambda = \Pi (x^* - \hat{x}) \tag{7}$$

where $\Pi(t_0) = \Pi_0$, $\hat{x}(t_0) = \hat{x}_0$ and $\hat{x}$ is an intermediate state. By differentiating (7), using (6),
adding and subtracting $\Pi A \hat{x}$ and $C^T V^{-1} C \hat{x}$, the following dynamic filter structure results:

$$\Pi \dot{\hat{x}} = \Pi A \hat{x} + \Pi B u + C^T V^{-1} (y - C\hat{x}), \qquad \hat{x}(t_0) = \hat{x}_0 \tag{8}$$

$$-\dot{\Pi} = \Pi A + A^T \Pi + \Pi \left( \frac{1}{\gamma} F_2 Q_2 F_2^T - F_1 Q_1 F_1^T \right) \Pi - C^T V^{-1} C, \qquad \Pi(t_0) = \Pi_0 \tag{9}$$

Since $x^* = \hat{x}$ at the current time t (5), the generalized least-squares fault detection filter is (8). Note
that (8) is used by the residual (3) to detect the target fault.


4.  LIMITING  CASE 

In this section, the min-max problem (2) is solved in the limit where $\gamma$ is zero [2,5]. When $\gamma$ is zero,
there is no constraint on $\mu_2$ to minimize $y - Cx$. Therefore, the nuisance fault is completely
blocked from the residual, which is shown in Section 5.

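In the limit, the min-max problem (2) becomes

$$\min_{\mu_1} \max_{\mu_2} \max_{x(t_0)} \; \frac{1}{2} \int_{t_0}^{t} \left( \|\mu_1\|^2_{Q_1^{-1}} - \|y - Cx\|^2_{V^{-1}} \right) d\tau - \frac{1}{2} \|x(t_0) - \hat{x}_0\|^2_{\Pi_0} \tag{10}$$

This problem is singular with respect to $\mu_2$. Therefore, the Goh transformation [5] is used to form
a non-singular problem. Let

$$\phi_1(\tau) = \int_{t_0}^{\tau} \mu_2(s)\, ds, \qquad \alpha_1 = x - F_2 \phi_1$$

By differentiating $\alpha_1$ and using (1a),

$$\dot{\alpha}_1 = A\alpha_1 + Bu + F_1\mu_1 + B_1\phi_1 \tag{11}$$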
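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where $B_1 = AF_2 - \dot{F}_2$. By substituting $\alpha_1$ into (10), the new min-max problem is

$$\min_{\mu_1} \max_{\phi_1} \max_{\alpha_1(t_0^+)} \; \frac{1}{2} \int_{t_0}^{t} \left[ \|\mu_1\|^2_{Q_1^{-1}} - \|\phi_1\|^2_{F_2^T C^T V^{-1} C F_2} - \|y - C\alpha_1\|^2_{V^{-1}} + (y - C\alpha_1)^T V^{-1} C F_2 \phi_1 + \phi_1^T F_2^T C^T V^{-1} (y - C\alpha_1) \right] d\tau - \frac{1}{2} \|\alpha_1(t_0^+) + F_2\phi_1(t_0^+) - \hat{x}_0\|^2_{\Pi_0} \tag{12}$$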

subject to (11). If $F_2^T C^T V^{-1} C F_2$ fails to be positive definite, (12) is still a singular problem with
respect to $\phi_1$. Then, the Goh transformation has to be used until the problem becomes
non-singular. If $F_2^T C^T V^{-1} C F_2 = 0$, let

$$\phi_2(\tau) = \int_{t_0}^{\tau} \phi_1(s)\, ds, \qquad \alpha_2 = \alpha_1 - B_1 \phi_2$$

Then, $\dot{\alpha}_2 = A\alpha_2 + Bu + F_1\mu_1 + B_2\phi_2$ where $B_2 = AB_1 - \dot{B}_1$. If $F_2^T C^T V^{-1} C F_2 \ge 0$, the Goh
transformation is applied only on the singular part [6]. The transformation process stops if the
weighting on $\phi_2$, $B_1^T C^T V^{-1} C B_1$, is positive definite. Otherwise, continue the transformation until
there exists $B_k$ such that the weighting on $\phi_k$, $B_{k-1}^T C^T V^{-1} C B_{k-1}$, is positive definite. Then, in the
limit, the min-max problem (2) becomes

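$$\min_{\mu_1} \max_{\phi_k} \max_{\alpha_k(t_0^+)} \; \frac{1}{2} \int_{t_0}^{t} \left[ \|\mu_1\|^2_{Q_1^{-1}} - \|\phi_k\|^2_{B_{k-1}^T C^T V^{-1} C B_{k-1}} - \|y - C\alpha_k\|^2_{V^{-1}} + (y - C\alpha_k)^T V^{-1} C B_{k-1} \phi_k + \phi_k^T B_{k-1}^T C^T V^{-1} (y - C\alpha_k) \right] d\tau - \frac{1}{2} \|\alpha_k(t_0^+) + \bar{B}\bar{\phi}(t_0^+) - \hat{x}_0\|^2_{\Pi_0} \tag{13}$$

subject to $\dot{\alpha}_k = A\alpha_k + Bu + F_1\mu_1 + B_k\phi_k$ where $\bar{B} = [F_2 \; B_1 \; B_2 \; \cdots \; B_{k-1}]$ and
$\bar{\phi} = [\phi_1^T \; \phi_2^T \; \cdots \; \phi_k^T]^T$. The min-max problem (13) can be solved similarly to (2). Therefore, the
derivation [6] is not repeated here. The limiting generalized least-squares fault detection filter is

$$S\dot{\hat{x}} = SA\hat{x} + SBu + \left[ S B_k (B_{k-1}^T C^T V^{-1} C B_{k-1})^{-1} B_{k-1}^T C^T V^{-1} + C^T \bar{H}^T V^{-1} \bar{H} \right] (y - C\hat{x}) \tag{14}$$

where

$$-\dot{S} = S\bar{A} + \bar{A}^T S + S \left[ B_k (B_{k-1}^T C^T V^{-1} C B_{k-1})^{-1} B_k^T - F_1 Q_1 F_1^T \right] S - C^T \bar{H}^T V^{-1} \bar{H} C \tag{15}$$

$\bar{H} = I - C B_{k-1} (B_{k-1}^T C^T V^{-1} C B_{k-1})^{-1} B_{k-1}^T C^T V^{-1}$ and $\bar{A} = A - B_k (B_{k-1}^T C^T V^{-1} C B_{k-1})^{-1} B_{k-1}^T C^T V^{-1} C$,
subject to $\hat{x}(t_0^+) = \hat{x}_0$ and $S(t_0^+) = \Pi_0 - \Pi_0 \bar{B} (\bar{B}^T \Pi_0 \bar{B})^{-1} \bar{B}^T \Pi_0$. However, (14) cannot
be used because S has a null space, which is shown in Theorem 4.1. Therefore, a reduced-order
filter for (14) is derived in Section 6.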
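
The repeated Goh transformation described above can be sketched for the simplest situation: a time-invariant system (so that $\dot{F}_2 = 0$ and $B_j = A B_{j-1}$) in which the weighting on $\phi_j$ is either exactly zero or positive definite at every step. The partially singular case handled in Reference [6] is deliberately not covered. The function name `goh_chain` and the double-integrator test system are hypothetical.

```python
import numpy as np

def goh_chain(A, C, F2, V, tol=1e-9):
    """Return ([F2, B1, ..., B_{k-1}], B_k) for a time-invariant system.

    Iterates B_j = A B_{j-1} until B_{k-1}^T C^T V^{-1} C B_{k-1} is
    positive definite. Raises if the weighting is only partially singular.
    """
    A, C, F2, V = (np.asarray(X, dtype=float) for X in (A, C, F2, V))
    n = A.shape[0]
    Vinv = np.linalg.inv(V)
    chain = [F2]
    B = F2
    for _ in range(n):
        W = B.T @ C.T @ Vinv @ C @ B          # weighting on phi_k
        if np.all(np.linalg.eigvalsh(W) > tol):
            return chain, A @ B               # B_k = A B_{k-1} when dB/dt = 0
        if np.linalg.norm(W) > tol:
            raise NotImplementedError("partially singular weighting not handled")
        B = A @ B
        chain.append(B)
    raise ValueError("weighting never became positive definite")

# Hypothetical double integrator: the nuisance fault enters the second state,
# only the first state is measured, so C F2 = 0 and two steps are needed.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
C = np.array([[1.0, 0.0]])
F2 = np.array([[0.0], [1.0]])
V = np.array([[1.0]])

chain, B_k = goh_chain(A, C, F2, V)
k = len(chain)
```

For time-varying systems the same loop applies with $B_j = AB_{j-1} - \dot{B}_{j-1}$, which requires the derivatives of the system matrices.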

Theorem  4.1. 

$S\,[B_{k-1} \; B_{k-2} \; \cdots \; B_1 \; F_2] = 0$.

Proof.  The  proof  is  similar  to  Reference  [2]  and  can  be  found  in  Reference  [6].  □ 




5.  PROPERTIES  OF  THE  NULL  SPACE  OF  S 

In this section, some properties of the null space of S are given. It is shown that the null space of
S is equivalent to the minimal (C, A)-unobservability subspace for time-invariant systems and
a similar invariant subspace for time-varying systems. Therefore, the limiting generalized
least-squares fault detection filter is equivalent to the unknown input observer and extends it to the
time-varying case. The minimal (C, A)-unobservability subspace is a subspace which is (A − LC)-invariant
and unobservable with respect to (HC, A − LC) for some filter gain L and projector
H [1]. One method for computing the minimal (C, A)-unobservability subspace of $F_2$, called
$\mathcal{T}_2$ here, is $\mathcal{T}_2 = \mathcal{W}_2 \oplus \mathcal{V}_2$ [1] where $\mathcal{W}_2 = [B_{k-1} \; B_{k-2} \; \cdots \; B_1 \; F_2]$ is the minimal (C, A)-invariant
subspace of $F_2$ and $\mathcal{V}_2$ is the subspace spanned by the invariant zero directions of $(C, A, F_2)$.
Note that the associated H is

$$\operatorname{Ker} H = C B_{k-1}, \qquad H = I - C B_{k-1} \left[ (C B_{k-1})^T C B_{k-1} \right]^{-1} (C B_{k-1})^T \tag{16}$$

Note that $\operatorname{Ker} H = \operatorname{Ker} \bar{H}$.

Theorem 5.1 shows that the null space of S is a (C, A)-invariant subspace. Theorem 5.2 shows
that the null space of S is contained in the unobservable subspace of (HC, A − LC).

Theorem  5.1. 

Ker S is a (C, A)-invariant subspace.

Proof. The dynamic equation of the error, $e = x - \hat{x}$, in the absence of the target fault and
sensor noise can be obtained by using (1) and (14):

$$S\dot{e} = \left[ SA - S B_k (B_{k-1}^T C^T V^{-1} C B_{k-1})^{-1} B_{k-1}^T C^T V^{-1} C - C^T \bar{H}^T V^{-1} \bar{H} C \right] e$$

because $SF_2 = 0$. By adding $\dot{S}e$ to both sides and using (15),

$$\frac{d}{dt}(Se) = -\left\{ \left[ A - B_k (B_{k-1}^T C^T V^{-1} C B_{k-1})^{-1} B_{k-1}^T C^T V^{-1} C \right]^T + S \left[ -F_1 Q_1 F_1^T + B_k (B_{k-1}^T C^T V^{-1} C B_{k-1})^{-1} B_k^T \right] \right\} Se \tag{17}$$

If the error initially lies in Ker S, (17) implies that the error will never leave Ker S. Therefore,
Ker S is a (C, A)-invariant subspace. □

Theorem  5.2. 

Ker S is contained in the unobservable subspace of (HC, A − LC).

Proof. Let $\xi \in \operatorname{Ker} S$. By multiplying (15) by $\xi^T$ from the left and $\xi$ from the right,

$$\frac{d}{dt}(\xi^T S \xi) = \xi^T C^T \bar{H}^T V^{-1} \bar{H} C \xi = 0$$
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Then, $HC\xi = 0$ because $\bar{H}C\xi = 0$ and $\operatorname{Ker} H = \operatorname{Ker} \bar{H}$. From Theorem 5.1, Ker S is a
(C, A)-invariant subspace. Therefore, Ker S is contained in the unobservable subspace of
(HC, A − LC). □

From Theorem 4.1, $C \operatorname{Ker} S \supseteq C B_{k-1}$. From Theorem 5.2, $C \operatorname{Ker} S \subseteq C B_{k-1}$. Therefore,
$C \operatorname{Ker} S = C B_{k-1}$ and $\hat{H}$ (4) is equivalent to H (16). Note that (16) is a better way to form $\hat{H}$,
which is used by the residual (3), because it does not require the solution to the limiting Riccati
equation (15).

For time-invariant systems, it is important to discuss the invariant zero directions when
designing the fault detection filter. The invariant zeros of $(C, A, F_2)$ will become part of the
eigenvalues of the filter if their associated invariant zero directions are not included in the
invariant subspace of $F_2$ [1]. From References [3,6], the null space of S includes all the invariant
zero directions if the nuisance fault direction is modified to the invariant zero directions.
Therefore, the invariant zeros will not become part of the filter eigenvalues. From Theorem 4.1
and the modified nuisance fault direction, the null space of S contains the minimal (C, A)-unobservability
subspace of $F_2$. By combining with Theorem 5.2, the null space of S is equivalent to the
minimal (C, A)-unobservability subspace of $F_2$, and the limiting generalized least-squares fault
detection filter is equivalent to the unknown input observer. Note that the invariant zero and
minimal (C, A)-unobservability subspace are only defined for time-invariant systems. For
time-varying systems, Theorems 4.1, 5.1 and 5.2 imply that the null space of S is a similar invariant
subspace.

Remark  1. 

In order to detect the target fault, $F_1$ cannot intersect the null space of S, which is unobservable
to the residual. If it does, the target fault will be difficult or impossible to detect even though the
filter can still be derived by solving the min-max problem. If $F_1$ does not intersect the null space of
S, $F_1$ and $F_2$ are called output separable [1], and the output separability test can be stated as
$C B_{k-1} \cap C \tilde{B}_{j-1} = 0$ where $\tilde{B}_{j-1}$ is the Goh transformation of $F_1$.
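
One way to check the condition in Remark 1 numerically is a rank test: two column spaces intersect only at the origin exactly when the rank of the concatenated matrix equals the sum of the individual ranks. The sketch below assumes the two output directions have already been computed. The function name `output_separable` and the numerical directions are hypothetical.

```python
import numpy as np

def output_separable(M1, M2, tol=1e-9):
    """True if range(M1) and range(M2) intersect only at the origin."""
    r1 = np.linalg.matrix_rank(M1, tol=tol)
    r2 = np.linalg.matrix_rank(M2, tol=tol)
    r12 = np.linalg.matrix_rank(np.hstack([M1, M2]), tol=tol)
    return bool(r12 == r1 + r2)

# Hypothetical output directions of the target and nuisance faults.
CB_target = np.array([[0.0], [1.0]])
CB_nuisance = np.array([[1.0], [1.0]])

separable = output_separable(CB_target, CB_nuisance)
not_separable = output_separable(CB_nuisance, 2.0 * CB_nuisance)
```

The tolerance matters in practice: directions that are nearly parallel pass the test formally but give a poorly conditioned filter.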

6. REDUCED-ORDER FILTER

In this section, the reduced-order filter is derived for the limiting generalized least-squares fault
detection filter (14). The reduced-order filter is necessary for implementation because (14) cannot
be used due to the null space of S. Since S is non-negative definite, there exists a state
transformation $\Gamma$ such that

$$\Gamma^T S \Gamma = \begin{bmatrix} \bar{S} & 0 \\ 0 & 0 \end{bmatrix} \tag{18}$$

where $\bar{S}$ is positive definite. Theorem 6.1 provides a way to form the transformation.

Theorem 6.1.

There exists a state transformation $\Gamma$ where

$$[\,Z \;\; \operatorname{Ker} S\,] = \Gamma \begin{bmatrix} Z_1 & 0 \\ 0 & Z_2 \end{bmatrix} \tag{19}$$
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Z is any $n \times (n - k_2)$ continuously differentiable matrix such that itself and Ker S span the state
space where $n = \dim \mathcal{X}$ and $k_2 = \dim(\operatorname{Ker} S)$. $Z_1$ and $Z_2$ are any $(n - k_2) \times (n - k_2)$ and $k_2 \times k_2$
invertible continuously differentiable matrices, respectively. Then, the $\Gamma$ obtained from (19)
satisfies (18).

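Proof.

$$\operatorname{Ker} S = \Gamma \begin{bmatrix} 0 \\ Z_2 \end{bmatrix} \;\Rightarrow\; S\Gamma \begin{bmatrix} 0 \\ Z_2 \end{bmatrix} = 0 \;\Rightarrow\; \Gamma^T S \Gamma \begin{bmatrix} 0 \\ Z_2 \end{bmatrix} = 0$$

Since $Z_2$ is invertible by definition and $\Gamma^T S \Gamma$ is symmetric, (18) is true. □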


Note that Theorem 6.1 does not define $\Gamma$ uniquely and $\Gamma$ can be computed a priori because
Ker S can be obtained a priori.

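By applying the transformation to the estimator state, $\Gamma^{-1}\hat{x} \triangleq \eta = [\eta_1^T \; \eta_2^T]^T$. By multiplying
(14) by $\Gamma^T$ from the left, using $\Gamma\Gamma^{-1} = I$, and adding $\Gamma^T S \Gamma \frac{d}{dt}(\Gamma^{-1}) \hat{x}$ to both sides, the limiting filter
can be transformed into two equations,

$$\bar{S}\dot{\eta}_1 = \bar{S}(A_{11} - \Gamma_{11})\eta_1 + \bar{S}(A_{12} - \Gamma_{12})\eta_2 + \bar{S}M_1 u + \left[ \bar{S} G_1 (D_2^T C_2^T V^{-1} C_2 D_2)^{-1} D_2^T C_2^T V^{-1} + C_1^T \bar{H}^T V^{-1} \bar{H} \right] (y - C_1\eta_1 - C_2\eta_2) \tag{20a}$$

$$0 = C_2^T \bar{H}^T V^{-1} \bar{H} (y - C_1\eta_1 - C_2\eta_2) \tag{20b}$$

where

$$\Gamma^{-1} A \Gamma = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \quad \Gamma^{-1} \dot{\Gamma} = \begin{bmatrix} \Gamma_{11} & \Gamma_{12} \\ \Gamma_{21} & \Gamma_{22} \end{bmatrix}, \quad \Gamma^{-1} B = \begin{bmatrix} M_1 \\ M_2 \end{bmatrix}, \quad \Gamma^{-1} F_1 = \begin{bmatrix} N_1 \\ N_2 \end{bmatrix}, \quad \Gamma^{-1} B_k = \begin{bmatrix} G_1 \\ G_2 \end{bmatrix}, \quad \Gamma^{-1} B_{k-1} = \begin{bmatrix} 0 \\ D_2 \end{bmatrix}, \quad C\Gamma = [\,C_1 \;\; C_2\,]$$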
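Note that $\Gamma^{-1}$ and $\dot{\Gamma}$ can be computed a priori from (19). From (20b),

$$\bar{H} C_2 = 0 \tag{21}$$

because $y - C_1\eta_1 - C_2\eta_2$ is arbitrary. By multiplying (15) by $\Gamma^T$ from the left and $\Gamma$ from the
right, subtracting $\dot{\Gamma}^T S \Gamma$ and $\Gamma^T S \dot{\Gamma}$ from both sides, and using $\Gamma\Gamma^{-1} = I$, the limiting Riccati
equation can be transformed into two equations,

$$0 = \bar{S} \left[ A_{12} - \Gamma_{12} - G_1 (D_2^T C_2^T V^{-1} C_2 D_2)^{-1} D_2^T C_2^T V^{-1} C_2 \right] \tag{22}$$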
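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$$-\dot{\bar{S}} = \bar{S} \left[ A_{11} - \Gamma_{11} - G_1 (D_2^T C_2^T V^{-1} C_2 D_2)^{-1} D_2^T C_2^T V^{-1} C_1 \right] + \left[ A_{11} - \Gamma_{11} - G_1 (D_2^T C_2^T V^{-1} C_2 D_2)^{-1} D_2^T C_2^T V^{-1} C_1 \right]^T \bar{S} + \bar{S} \left[ -N_1 Q_1 N_1^T + G_1 (D_2^T C_2^T V^{-1} C_2 D_2)^{-1} G_1^T \right] \bar{S} - C_1^T \bar{H}^T V^{-1} \bar{H} C_1 \tag{23}$$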

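By substituting (21) and (22) into (20a), the reduced-order limiting generalized least-squares fault
detection filter is

$$\dot{\eta}_1 = (A_{11} - \Gamma_{11})\eta_1 + M_1 u + \left[ G_1 (D_2^T C_2^T V^{-1} C_2 D_2)^{-1} D_2^T C_2^T V^{-1} + \bar{S}^{-1} C_1^T \bar{H}^T V^{-1} \bar{H} \right] (y - C_1\eta_1) \tag{24}$$

Note that $\Gamma_{11}$ can be computed a priori. In the limit, the residual (3) becomes

$$r = \hat{H}(y - C_1\eta_1) \tag{25}$$

because $\hat{H}C_2 = 0$ from (21) and $\operatorname{Ker} \hat{H} = \operatorname{Ker} \bar{H}$.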


7.  EXAMPLE 

In  this  section,  two  numerical  examples  are  used  to  demonstrate  the  performance  of  the 
generalized least-squares fault detection filter. In Section 7.1, the filter is applied to a
time-invariant system. In Section 7.2, the filter is applied to a time-varying system.


7.1.  Example  1 

In this section, two cases for a time-invariant problem are presented. The first one shows that the
sensitivity of the filter (8) to the nuisance fault decreases when $\gamma$ is smaller. The second one shows
that the sensitivity of the reduced-order limiting filter (24) to the target fault increases when $Q_1$ is
larger. The system matrices are


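$$A = \begin{bmatrix} 0 & 3 & 4 \\ 1 & 2 & 3 \\ 0 & 2 & 5 \end{bmatrix}, \quad F_1 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad F_2 = \begin{bmatrix} 5 \\ 1 \\ 1 \end{bmatrix}, \quad C = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$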
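
The assumptions of Section 2 can be checked numerically for this example. The sketch below is an illustration only and assumes the matrices read as $F_1 = [0\;0\;1]^T$, $F_2 = [5\;1\;1]^T$ and $C = [0\;1\;0;\;0\;0\;1]$, an assignment inferred from the layout of the source and from the form of $F_2$ in Example 2. It tests observability of (C, A) and, for Assumption 2.2, the absence of an invariant zero of $(C, A, F_1)$ at the origin through the rank of the system matrix at s = 0.

```python
import numpy as np

A = np.array([[0.0, 3.0, 4.0],
              [1.0, 2.0, 3.0],
              [0.0, 2.0, 5.0]])
F1 = np.array([[0.0], [0.0], [1.0]])   # assumed target fault direction
F2 = np.array([[5.0], [1.0], [1.0]])   # assumed nuisance fault direction
C = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
n = A.shape[0]

# Observability of (C, A): rank of [C; CA; CA^2] must equal n.
obsv = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(n)])
observable = bool(np.linalg.matrix_rank(obsv) == n)

# Assumption 2.2: the system matrix [sI - A, -F1; C, 0] at s = 0 has full
# column rank if and only if (C, A, F1) has no invariant zero at the origin.
system_matrix = np.block([[-A, -F1],
                          [C, np.zeros((C.shape[0], F1.shape[1]))]])
no_zero_at_origin = bool(np.linalg.matrix_rank(system_matrix) == n + F1.shape[1])

# First Goh weighting with V = I: positive, so a single step suffices.
first_weighting = float((F2.T @ C.T @ C @ F2)[0, 0])
```

Since $CF_2 \neq 0$, the Goh transformation stops at the first step, so $B_{k-1} = F_2$ for this example under the stated reading of the matrices.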

In the first case, the steady-state solutions to the Riccati equation (9) are obtained with
weightings chosen as $Q_1 = 1$, $Q_2 = 1$, and $V = I$ when $\gamma = 10^{-\ldots}$ and $10^{-\ldots}$, respectively. The top
two figures of Figure 1 show the frequency response from both faults to the residual (3). The left
one is for the larger $\gamma$ and the right one is for the smaller $\gamma$. The solid lines represent the target fault, and the
dashed lines represent the nuisance fault. This example shows that the nuisance fault transmission
can be reduced by using a smaller $\gamma$ while the target fault transmission is not affected.

In the second case, the steady-state solutions to the reduced-order limiting Riccati equation
(23) are obtained with $V = 10^{-\ldots} I$ when $Q_1 = 0$ and 0.0019, respectively. The lower two figures of
Figure 1 show the frequency response from the target fault and sensor noise to the residual (25).
The left one is $Q_1 = 0$, and the right one is $Q_1 = 0.0019$. The solid lines represent
the target fault, and the dashed lines represent the sensor noise. This example shows that the
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Figure  1.  Frequency  response  of  the  residual. 


sensitivity  of  the  filter  to  the  target  fault  can  be  enhanced  by  using  a  larger  Qi.  The  sensor  noise 
transmission  also  increases  because  part  of  the  sensor  noise  comes  through  the  same  direction  as 
the  target  fault.  However,  the  sensor  noise  transmission  is  small  compared  to  the  target  fault 
transmission.  In  this  case,  the  nuisance  fault  transmission  stays  zero  and  is  not  shown  in  these 
figures. Note that when $Q_1 = 0$, the generalized least-squares fault detection filter is similar to
Reference  [2]  which  does  not  enhance  the  target  fault  transmission. 

7.2.  Example  2 

In this section, the filter (8) and the reduced-order limiting filter (24) are applied to a time-varying
system which is obtained by modifying the time-invariant system in the previous section, adding some
time-varying elements to the A and $F_2$ matrices while the C and $F_1$ matrices are the same:

$$A = \begin{bmatrix} -\cos t & 3 + 2\sin t & 4 \\ 1 & 2 & 3 - 2\cos t \\ 5\sin t & 2 & 5 + 3\cos t \end{bmatrix}, \quad F_2 = \begin{bmatrix} 5 - 2\cos t \\ 1 \\ 1 + \sin t \end{bmatrix}$$

The Riccati equation (9) is solved with $Q_1 = 1$, $Q_2 = 1$, $V = I$ and $\gamma = 10^{-\ldots}$ for $t \in [0, 25]$. The
reduced-order limiting Riccati equation (23) is solved with the same $Q_1$ and V. Figure 2 shows the
time response of the norm of the residuals when there is no fault, a target fault and a nuisance
fault, respectively. The faults are unit steps that occur at the fifth second. In each case, there is no
sensor noise. The left three figures show the residual (3) for the filter (8). There is a small nuisance
fault transmission because (8) is an approximate unknown input observer. The right three figures
show the residual (25) for the reduced-order limiting filter (24). Note that the nuisance fault
transmission is zero. This example shows that both filters, (8) and (24), work well for time-varying
systems.
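
For the time-varying example, the first Goh direction $B_1 = AF_2 - \dot{F}_2$ of Section 4 can be evaluated in closed form because the derivative of $F_2$ is known analytically. The sketch below is illustrative only: the function names are hypothetical, and the matrices are entered as they are read from the source, with C taken from Example 1 and V = I.

```python
import numpy as np

def A_of_t(t):
    """Time-varying A matrix of Example 2, as read from the source."""
    return np.array([[-np.cos(t), 3 + 2 * np.sin(t), 4.0],
                     [1.0, 2.0, 3 - 2 * np.cos(t)],
                     [5 * np.sin(t), 2.0, 5 + 3 * np.cos(t)]])

def F2_of_t(t):
    """Time-varying nuisance fault direction of Example 2."""
    return np.array([5 - 2 * np.cos(t), 1.0, 1 + np.sin(t)])

def F2_dot_of_t(t):
    """Analytic time derivative of F2_of_t."""
    return np.array([2 * np.sin(t), 0.0, np.cos(t)])

def B1_of_t(t):
    """First Goh direction B1 = A F2 - dF2/dt."""
    return A_of_t(t) @ F2_of_t(t) - F2_dot_of_t(t)

C = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])   # assumed unchanged from Example 1

B1_at_0 = B1_of_t(0.0)
# Weighting F2^T C^T C F2 with V = I; it equals 1 + (1 + sin t)^2 > 0,
# so the Goh transformation stops after one step for every t.
weighting_at_0 = float(F2_of_t(0.0) @ C.T @ C @ F2_of_t(0.0))
```

When the derivative of a fault direction is not available analytically, a finite-difference approximation could be substituted, at the cost of additional numerical noise in $B_1$.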
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Figure  2.  Time  response  of  the  residual 


8.  CONCLUSION 

The  generalized  least-squares  fault  detection  filter  is  derived  from  solving  a  min-max  problem 
which  makes  the  residual  sensitive  to  the  target  fault,  but  insensitive  to  the  nuisance  faults.  In  the 
limit  where  the  weighting  on  the  nuisance  faults  is  zero,  the  filter  becomes  equivalent  to  the 
unknown input observer which places the nuisance faults into a minimal (C, A)-unobservability
subspace and there exists a reduced-order filter. Since the target fault is explicit in the problem
formulation, the sensitivity of the filter to the target fault can be enhanced. Filter designs can be
obtained for both linear time-invariant and time-varying systems.
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Abstract 

A class of robust fault detection filters is generalized
from detecting a single fault to multiple faults. This
generalization is called the optimal stochastic multiple-fault
detection filter since in the formulation, the unknown
fault amplitudes are modeled as white noise.
The residual space of the filter is divided into several
subspaces and each subspace is sensitive to only one
fault (target fault), but not to other faults (nuisance
faults), in the sense that the transmission from nuisance
faults to the target residual space is small while
the transmission from the target fault is large. It is shown
that this filter approximates the properties of the classical
fault detection filter such that in the limit where
the nuisance fault weighting goes to infinity, the optimal
stochastic multiple-fault detection filter is equivalent
to the Beard-Jones fault detection filter when there
is no complementary subspace. A numerical example
also shows that this filter is an approximate Beard-Jones
fault detection filter even when the complementary
subspace exists. This filter combines the advantages of
the robust single-fault detection filter and the Beard-Jones
fault detection filter.

1  Introduction 

Any system under automatic control demands a high
degree of system reliability and this requires a health
monitoring system capable of detecting any system, actuator
and sensor fault as it occurs and identifying the
faulty component. One approach, analytical redundancy,
uses the modeled dynamic relationship between
system inputs and measured system outputs to form
a residual process used for detecting and identifying
faults. Nominally, the residual is nonzero only when a
fault has occurred and is zero at other times.

A  popular  approach  to  analytical  redundancy  is  the 
detection  filter  which  was  first  introduced  by  [1]  and 
refined by [2]. It is also known as the Beard-Jones
fault detection filter. A geometric interpretation and a
spectral  approach  of  this  filter  are  given  in  [3]  and  [4], 
respectively.  Design  algorithms  have  been  developed 

*This work was sponsored by Air Force Office of Scientific Research,
Award No. F49620-97-1-0272 and NASA-Ames Research
Center, Cooperative Agreement NCC2-374, Supplement 19

0-7803-5250-5/99/$  10.00  ©  1999  IEEE 


[5, 6] which improved detection filter robustness. The
idea of a detection filter is to put the reachable subspace
of each fault into invariant subspaces which do not overlap
with each other. Then, when a nonzero residual is
detected, a fault can be announced and identified by
projecting the residual onto each of the invariant subspaces.
Therefore, multiple faults can be monitored in
one filter.

Another related approach, the unknown input observer
[7], simplifies the detection filter problem by dividing
the faults into a target fault and a nuisance fault group
where the nuisance faults are placed into one invariant
subspace. Although only one fault can be detected in
each unknown input observer, additional flexibility in
fault detection filter design for robustness and time-varying
systems is obtained by using an approximate
fault detection filter [8, 9, 10, 11, 12].

In this paper, an extension of the optimal stochastic
fault detection filter [11] is presented. The optimal
stochastic fault detection filter, which is an approximate
unknown input observer, allows additional robustness
in the fault detection filter design. However,
it can detect only one fault in each filter. In contrast,
the Beard-Jones fault detection filter can detect multiple
faults in one filter, but is not very robust. From the
problem formulation of the optimal stochastic fault detection
filter, it seems natural that the multiple faults
objective may be achieved. This is done by dividing the
residual space of the filter into several subspaces by projectors
and having each subspace sensitive to only one
fault (target fault), but not to other faults (nuisance
faults), in the sense that the transmission from nuisance
faults to the target residual space is small while
the transmission from the target fault is large. In the limit
where the nuisance fault weighting goes to infinity and
in the absence of sensor noise, it is shown that the optimal
stochastic multiple-fault detection filter becomes a
Beard-Jones fault detection filter when there is no complementary
subspace. Note that the $H_\infty$ bounded fault
detection filter [6] imposed the detection filter structure
constraint while the detection filter structure is
generated from the problem formulation of the optimal
stochastic multiple-fault detection filter. Also, a
numerical  example  shows  that  this  filter  is  an  approxi- 
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mate  Beard-Jones  fault  detection  filter  when  it  is  not  in 
the  limit  even  with  the  existence  of  the  complementary 
subspace. 

The  problem  is  formulated  in  Section  2  and  the  solution 
is  derived  in  Section  3.  In  Section  4,  the  filter  is  derived 
for  the  limiting  case  when  there  is  no  complementary 
subspace.  In  Section  5,  a  numerical  example  is  given. 

2  Problem  Formulation 

In this section, the fault detection filter problem is formulated. From [1, 3, 4, 8], a linear time-invariant, $(C, A)$ observable system with $q$ plant, actuator and sensor faults can be modeled by

$$\dot{x} = Ax + Bu + \sum_{i=1}^{q} F_i \mu_i \tag{1a}$$

$$y = Cx + v \tag{1b}$$

where $u$ is the control input, $y$ is the measurement and $v$ is the sensor noise. The failure modes $\mu_i$ are vectors that are unknown and arbitrary functions of time and are zero when there is no failure. The failure signatures $F_i$ are maps that are known. A failure mode $\mu_i$ models the time-varying amplitude of a failure while a failure signature $F_i$ models the directional characteristics of a failure. Assume the $F_i$ are monic so that $\mu_i \neq 0$ implies $F_i \mu_i \neq 0$.
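For a constant matrix, the monic condition amounts to full column rank. The following is a minimal sketch of that check; the function name `is_monic` and the numerical values of the two signatures are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def is_monic(F, tol=1e-10):
    """Return True if F has full column rank, so F @ mu != 0 whenever mu != 0."""
    F = np.asarray(F, dtype=float)
    if F.ndim != 2:
        raise ValueError("F must be a 2-D array with one column per fault component")
    return bool(np.linalg.matrix_rank(F, tol=tol) == F.shape[1])

# Hypothetical failure signatures for a 3-state system (illustrative values only).
F_good = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [0.0, 0.0]])
F_bad = np.array([[1.0, 2.0],
                  [0.5, 1.0],
                  [0.0, 0.0]])   # second column is a multiple of the first

print(is_monic(F_good))
print(is_monic(F_bad))
```

A signature that fails this test has a direction of $\mu_i$ that produces no effect on the state, so that part of the failure could never be detected.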

There are two assumptions about system (1) in order to have a well-conditioned fault detection filter. Assumption 2.1 ensures the separation of the faults $\mu_i$, $i = 1, \dots, q$ [3, 8]. Assumption 2.2 ensures a nonzero residual in steady state when the target fault occurs [11].

Assumption 2.1. $F_1, \dots, F_q$ are output separable.

Assumption 2.2. $(C, A, F_i)$, $i = 1, \dots, q$, do not have transmission zeros at the origin.

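Assume $\mu_i$, $i = 1, \dots, q$, and $v$ are zero mean, white Gaussian noise with

$$E[\mu_i(t)\mu_i(\tau)^T] = Q_i\,\delta(t - \tau) \tag{2a}$$

$$E[v(t)v(\tau)^T] = V\,\delta(t - \tau) \tag{2b}$$

and $E[x(t_0)x(t_0)^T] = P_0$. Also, $\mu_i$, $i = 1, \dots, q$, and $v$ are uncorrelated with each other and with $x(t_0)$. For simplicity, the following notation is made for use later.

$$\hat{\mu}_i = [\,\mu_1^T \ \cdots \ \mu_{i-1}^T \ \ \mu_{i+1}^T \ \cdots \ \mu_q^T\,]^T$$

$$\hat{F}_i = [\,F_1 \ \cdots \ F_{i-1} \ \ F_{i+1} \ \cdots \ F_q\,]$$

and $\hat{Q}_i$ is the power spectral density of $\hat{\mu}_i$, i.e. $E[\hat{\mu}_i(t)\hat{\mu}_i(\tau)^T] = \hat{Q}_i\,\delta(t - \tau)$.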


The objective of the optimal stochastic multiple-fault detection filter problem is to find a filter gain $L$ for the linear observer,

$$\dot{\hat{x}} = A\hat{x} + Bu + L(y - C\hat{x})$$

and the residual,

$$r = y - C\hat{x} \tag{3}$$

such that each projected residual $\hat{H}_i r$ is affected essentially only by its target fault $\mu_i$, and minimally by its nuisance faults $\hat{\mu}_i$, sensor noise $v$ and initial condition error $x(t_0) - \hat{x}(t_0)$. The $\hat{H}_i$ are projectors, also used by the Beard-Jones fault detection filter, which map the reachable subspace of $\hat{\mu}_i$ to zero,

$$\hat{H}_i : \mathcal{Y} \to \mathcal{Y}, \qquad \operatorname{Ker}\hat{H}_i = C\hat{T}_i$$

where $T_i$ is the minimal $(C, A)$-unobservability subspace of $F_i$ with $k_i = \dim T_i$, and $\hat{T}_i$ is the corresponding subspace for the nuisance directions $\hat{F}_i$. A minimal $(C, A)$-unobservability subspace [3, 7] implies that there is a projector $H$ induced from the fault directions such that $(HC, A - LC)$ has an unobservable subspace for some filter gain $L$. The error, $e = x - \hat{x}$, can be written as

$$e(t) = \Phi(t, t_0)e(t_0) + \int_{t_0}^{t} \Phi(t, \tau)\Big(\sum_{i=1}^{q} F_i\mu_i - Lv\Big)\,d\tau \tag{4}$$

subject to

$$\frac{d}{dt}\Phi(t, t_0) = (A - LC)\Phi(t, t_0), \qquad \Phi(t_0, t_0) = I \tag{5}$$

and the residual (3) becomes $r = Ce + v$.
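The way a fault enters the residual through the error dynamics can be illustrated with a small noise-free simulation. This is only a sketch under stated assumptions: the two-state system, the gain `L`, the fault direction `F1`, the step fault and the forward-Euler integration are all hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical 2-state system; all numbers are illustrative assumptions.
A = np.array([[-1.0, 0.0],
              [0.0, -2.0]])
C = np.eye(2)
L = np.eye(2)                       # any gain that makes A - L C stable
F1 = np.array([1.0, 0.0])           # direction of the single fault

def simulate_residual(t_fault=1.0, t_end=3.0, dt=1e-3):
    """Euler-integrate e' = (A - L C) e + F1 * mu1 without noise; return (t, r = C e)."""
    n_steps = int(round(t_end / dt))
    t = np.arange(n_steps + 1) * dt
    e = np.zeros(2)                 # zero initial condition error
    r = np.zeros((n_steps + 1, 2))
    for k in range(n_steps):
        mu1 = 1.0 if t[k] >= t_fault else 0.0   # step fault starting at t_fault
        e = e + dt * ((A - L @ C) @ e + F1 * mu1)
        r[k + 1] = C @ e
    return t, r

t, r = simulate_residual()
before = np.abs(r[t < 1.0]).max()   # largest residual before the fault
after = np.abs(r[-1]).max()         # residual magnitude at the final time
print(before, after)
```

With no noise and no initial error, the residual stays at zero until the fault occurs and then settles at a nonzero value along the direction excited by `F1`.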

Now a performance index is needed for deriving the filter gain $L$. It seems that the most natural choice is to have the performance index be associated with the residual (i.e., $\hat{H}_i(Ce + v)$). However, it is unusable from a statistical viewpoint since the variance of the residual generates a $\delta$-function due to the sensor noise. The next choice is for the performance index to be associated with the output space $\mathcal{Y}$ (i.e., $\hat{H}_i Ce$). However, a unique solution cannot be obtained from minimizing the performance index associated with the output space because the information on the null space of $C$ is not available. This will become clear in Section 3. Therefore, the performance index will be associated with the state space (i.e., $H_i e$), which means the influence of $\hat{\mu}_i$, $v$ and $e(t_0)$ on $H_i e$ is minimized while the influence of $\mu_i$ is maximized. The $H_i$ associated with $\hat{H}_i$ is

$$H_i : \mathcal{X} \to \mathcal{X}, \qquad \operatorname{Ker}H_i = \hat{T}_i, \qquad H_i = I - \hat{T}_i[\hat{T}_i^T\hat{T}_i]^{-1}\hat{T}_i^T \tag{6}$$

Note that the Beard-Jones fault detection filter also works on the state space by assigning the eigenstructures. In Section 4, it will be shown that the projectors (6) will minimize the performance index in the limit.

Assumption 2.3. The invariant zero directions associated with the invariant zeros of $(C, A, F_i)$ on the left-half plane are included with the fault direction $F_i$ to produce the minimal $(C, A)$-unobservability subspace of $F_i$, $T_i$.
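The state-space projector in (6) is an orthogonal projector onto the complement of the nuisance subspace. A minimal sketch of its construction follows, assuming the nuisance subspace is supplied as a full-column-rank basis matrix; the function name and the example basis are hypothetical.

```python
import numpy as np

def nuisance_projector(T_hat):
    """H = I - T (T^T T)^{-1} T^T for a full-column-rank basis T of the nuisance subspace."""
    T_hat = np.asarray(T_hat, dtype=float)
    n = T_hat.shape[0]
    # Solve (T^T T) X = T^T instead of forming the inverse explicitly.
    return np.eye(n) - T_hat @ np.linalg.solve(T_hat.T @ T_hat, T_hat.T)

# Hypothetical basis (columns) for a two-dimensional nuisance subspace in R^3.
T_hat = np.array([[1.0, 0.0],
                  [1.0, 1.0],
                  [0.0, 1.0]])
H = nuisance_projector(T_hat)

print(np.allclose(H @ T_hat, 0.0))   # the nuisance subspace is annihilated
print(np.allclose(H @ H, H))         # H is idempotent
print(np.allclose(H, H.T))           # and symmetric
```

The three checks correspond to the properties used later in Section 4: the kernel of the projector is the nuisance subspace, and the projector is symmetric and idempotent.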


Remark 1. From [3], all invariant zero directions of $(C, A, F_i)$ have to be included in $T_i$ or the invariant zeros will be part of the eigenvalues of the filter. From the approach of [8], the invariant zero directions associated with the invariant zeros on the right-half plane and imaginary axis are automatically included in $T_i$. From [11, 12], the invariant zero directions associated with the invariant zeros on the left-half plane will also be included in $T_i$ only if the fault direction $F_i$ is modified.


Define

$$h_i(t) = H_i\int_{t_0}^{t}\Phi(t,\tau)F_i\mu_i\,d\tau \tag{7a}$$

$$\hat{h}_i(t) = H_i\int_{t_0}^{t}\Phi(t,\tau)\hat{F}_i\hat{\mu}_i\,d\tau \tag{7b}$$

$$h_{iv}(t) = H_i\Big[\Phi(t,t_0)e(t_0) - \int_{t_0}^{t}\Phi(t,\tau)Lv\,d\tau\Big] \tag{7c}$$

From (4), $h_i$ represents the transmission from the target fault to part of the error $H_i e$, $\hat{h}_i$ represents the transmission from the nuisance faults to $H_i e$, and $h_{iv}$ represents the transmission from the sensor noise and initial condition error to $H_i e$. Since the objective is to pick a filter gain $L$ such that each $H_i e$ is sensitive only to its target fault, but not to its nuisance faults, sensor noise and initial condition error, $\hat{h}_i$ and $h_{iv}$ are expected to be small while $h_i$ is expected to be large. This can be formulated as a non-convex minimization problem.


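$$\min_L J = \min_L \frac{1}{t_1 - t_0}\int_{t_0}^{t_1}\operatorname{tr}\Big\{\sum_{i=1}^{q}\Big(\frac{1}{\gamma}E\big[\hat{h}_i\hat{h}_i^T + h_{iv}h_{iv}^T\big] - E\big[h_ih_i^T\big]\Big)\Big\}\,dt$$

where $\gamma > 0$ and $t_1$ is the final time. The trace operator is used because the variance is a matrix.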


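3 Solution to the Disturbance Attenuation Problem

By using (2) and (7), the cost $J$ can be written as

$$J = \frac{1}{t_1-t_0}\int_{t_0}^{t_1}\operatorname{tr}\Big\{\sum_{i=1}^{q}\Big[H_i\int_{t_0}^{t}\Phi(t,\tau)\Big(\frac{1}{\gamma}LVL^T + \frac{1}{\gamma}\hat{F}_i\hat{Q}_i\hat{F}_i^T - F_iQ_iF_i^T\Big)\Phi(t,\tau)^T d\tau\,H_i + \frac{1}{\gamma}H_i\Phi(t,t_0)P_0\Phi(t,t_0)^TH_i\Big]\Big\}\,dt$$

To put the optimization problem in a more transparent context, $J$ is manipulated by adding the zero term

$$\frac{1}{t_1-t_0}\int_{t_0}^{t_1}\operatorname{tr}\Big\{\sum_{i=1}^{q}H_i\Big[\Phi(t,t)P_i(t)\Phi(t,t)^T - \Phi(t,t_0)P_i(t_0)\Phi(t,t_0)^T - \int_{t_0}^{t}\frac{d}{d\tau}\big(\Phi(t,\tau)P_i(\tau)\Phi(t,\tau)^T\big)\,d\tau\Big]H_i\Big\}\,dt$$

Then, the problem can be rewritten as

$$\min_L J = \min_L \frac{1}{t_1-t_0}\int_{t_0}^{t_1}\frac{1}{\gamma}\operatorname{tr}\Big\{\sum_{i=1}^{q}H_i\Big[\int_{t_0}^{t}\Phi(t,\tau)(L - \gamma P_iC^TV^{-1})V(L - \gamma P_iC^TV^{-1})^T\Phi(t,\tau)^T d\tau\Big]H_i\Big\}\,dt$$

$$= \min_L \frac{1}{t_1-t_0}\int_{t_0}^{t_1}\frac{1}{\gamma}\operatorname{tr}\Big(\sum_{i=1}^{q}H_iW_iH_i\Big)\,dt \tag{8}$$

subject to

$$\dot{P}_i = AP_i + P_iA^T - \gamma P_iC^TV^{-1}CP_i + \frac{1}{\gamma}\hat{F}_i\hat{Q}_i\hat{F}_i^T - F_iQ_iF_i^T \tag{9}$$

$$\dot{W}_i = (A - LC)W_i + W_i(A - LC)^T + (L - \gamma P_iC^TV^{-1})V(L - \gamma P_iC^TV^{-1})^T \tag{10}$$

where $P_i(t_0) = P_0/\gamma$ and $W_i(t_0) = 0$. The term $\frac{1}{t_1-t_0}\int_{t_0}^{t_1}\operatorname{tr}\big(\sum_{i=1}^{q}H_iP_iH_i\big)\,dt$ is dropped because $H_i$ is not being optimized here but chosen as in (6). See [11] for extension.

The variational Hamiltonian of the problem, with the positive constant factor $1/\gamma$ of (8) omitted since it does not change the minimizing $L$, is

$$\mathcal{H} = \operatorname{tr}\Big\{\sum_{i=1}^{q}\Big(H_iW_iH_i + \mathcal{K}_i\big[(A - LC)W_i + W_i(A - LC)^T + (L - \gamma P_iC^TV^{-1})V(L - \gamma P_iC^TV^{-1})^T\big]\Big)\Big\} \tag{11}$$

where $\mathcal{K}_i(t) \in \mathbb{R}^{n\times n}$ is a continuously differentiable matrix Lagrange multiplier. Note that $P_i$, $i = 1, \dots, q$, are independent of $L$. The first-order necessary conditions imply that the optimal strategy for $L$ and dynamics for $\mathcal{K}_i$ are

$$\frac{\partial\mathcal{H}}{\partial L} = \sum_{i=1}^{q}\big[-2CW_i\mathcal{K}_i + 2V(L - \gamma P_iC^TV^{-1})^T\mathcal{K}_i\big] = 0 \tag{12}$$

$$L^* = \Big(\sum_{i=1}^{q}\mathcal{K}_i\Big)^{-1}\Big[\sum_{i=1}^{q}\mathcal{K}_i(\gamma P_i + W_i)\Big]C^TV^{-1} \tag{13}$$

$$-\dot{\mathcal{K}}_i = \frac{\partial\mathcal{H}}{\partial W_i} = H_i + \mathcal{K}_i(A - LC) + (A - LC)^T\mathcal{K}_i \tag{14}$$

where $\mathcal{K}_i(t_1) = 0$. Note that $\mathcal{K}_i = \mathcal{K}_i^T$. Although the optimal filter gain $L^*$, (13) subject to (10), (14) and (9), requires the solution to a two-point boundary value problem, it can be computed off-line.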

More realistically, the infinite-time case allows a time-invariant $L^*$ where $2q$ algebraic Lyapunov equations (10) and (14) (i.e., $\dot{W}_i = \dot{\mathcal{K}}_i = 0$), coupled by (13), are to be solved. An alternative is to use a gradient method to numerically solve

$$\lim_{t_1 - t_0\to\infty}\min_L J = \min_L \operatorname{tr}\Big(\frac{1}{\gamma}\sum_{i=1}^{q}H_iW_iH_i\Big)$$

where $W_i$ is the solution to the algebraic (10).

Remark 2. The stability of the filter depends on the existence of $L^*$. If there is an $L^*$ such that the cost $J$ is at its minimum, $A - LC$ has to be stable, otherwise $J$ will become unbounded.
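A gradient method needs the infinite-time cost evaluated for a candidate gain. The sketch below shows one way this evaluation could look, assuming the matrices $P_i$ and $H_i$ are already available; it solves the algebraic form of (10) by Kronecker vectorization and returns an infinite cost for an unstable gain, in the spirit of Remark 2. The function names and all numerical values are hypothetical, and the identity matrix used for `P_list` is only a placeholder, not a solution of (9).

```python
import numpy as np

def solve_lyapunov(A_cl, Q):
    """Solve A_cl W + W A_cl^T + Q = 0 for W by Kronecker vectorization (row-major)."""
    n = A_cl.shape[0]
    I = np.eye(n)
    K = np.kron(A_cl, I) + np.kron(I, A_cl)
    return np.linalg.solve(K, -Q.reshape(-1)).reshape(n, n)

def steady_state_cost(L, A, C, V, gamma, P_list, H_list):
    """Evaluate tr((1/gamma) * sum_i H_i W_i H_i) with W_i from the algebraic form of (10)."""
    A_cl = A - L @ C
    if np.max(np.linalg.eigvals(A_cl).real) >= 0.0:
        return np.inf                      # unstable filter: the cost is unbounded
    V_inv = np.linalg.inv(V)
    total = 0.0
    for P, H in zip(P_list, H_list):
        M = L - gamma * P @ C.T @ V_inv
        W = solve_lyapunov(A_cl, M @ V @ M.T)
        total += np.trace(H @ W @ H) / gamma
    return total

# Hypothetical data for a single target fault (illustrative values only).
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
C = np.array([[1.0, 0.0]])
V = np.array([[0.1]])
gamma = 0.5
P_list = [np.eye(2)]                 # placeholder, not a solution of (9)
H_list = [np.diag([1.0, 0.0])]       # placeholder projector
L_stable = np.array([[1.0], [0.0]])
L_unstable = np.array([[-5.0], [0.0]])

cost_stable = steady_state_cost(L_stable, A, C, V, gamma, P_list, H_list)
cost_unstable = steady_state_cost(L_unstable, A, C, V, gamma, P_list, H_list)
print(cost_stable, cost_unstable)
```

A numerical optimizer could then adjust the entries of `L` to reduce `steady_state_cost`, with the infinite value acting as a barrier against unstable gains.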

Remark  3.  If  the  cost  J  is  associated  with 

the  output  space  (i.e.,  H  m  H  (11)  is  replaced 
by  HC),  the  Lagrange  multiplier  /Ci(f)  € 
and  therefore  £*  is  not  unique  from  (12).  « 

4  Limiting  Case 

In this section, the limit of the optimal stochastic multiple-fault detection filter is investigated where $V \to 0$ as $\gamma \to 0$ in such a way that $\gamma V^{-1} \to \bar{V}^{-1}$. Note that $V$ has to go to zero as $\gamma \to 0$ because physically sensor noise will produce a nonzero transmission. Similarly, the variance of the initial condition error $P_0 \to 0$ as $\gamma \to 0$ such that $\gamma P_0^{-1} \to \Pi_0$. The infinite-time $L^*$ (13) will be simplified for the limiting case and compared to the Beard-Jones fault detection filter. Assumption 4.1 is used to simplify $L^*$ and the following analysis.

Assumption 4.1. There is no complementary subspace.

Therefore, $T_1 \oplus \cdots \oplus T_q$ spans the state space $\mathcal{X}$ where $T_i$ is the minimal $(C, A)$-unobservability subspace of $F_i$. Assume,

$$(A - LC)T_i \subset T_i \tag{15}$$

for $i = 1, \dots, q$, which will be shown in Theorem 4.6. Lemmas 4.1 and 4.2 show that $\mathcal{K}_i$ has similar properties to $H_i$.

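Lemma 4.1. $\mathcal{K}_i\hat{T}_i = 0$.

Proof. By multiplying the algebraic form of (14) by $\hat{v}_{ij} \in \hat{T}_i = \operatorname{Ker}H_i$ from the right,

$$\mathcal{K}_i(A - LC)\hat{v}_{ij} + (A - LC)^T\mathcal{K}_i\hat{v}_{ij} = 0 \tag{16}$$

From (15), let $\hat{v}_{ij}$, $j = 1, \dots, \dim\hat{T}_i$, span $\hat{T}_i$ such that $(A - LC)\hat{v}_{ij} = \hat{\sigma}_{ij}\hat{v}_{ij}$. Then, (16) becomes

$$[\hat{\sigma}_{ij}I + (A - LC)^T]\mathcal{K}_i\hat{v}_{ij} = 0$$

which implies $\mathcal{K}_i\hat{v}_{ij} = 0$ because $A - LC$ has to be stable.

Lemma 4.2. If there is no complementary subspace, $\mathcal{K}_i(A - LC) = (A - LC)\mathcal{K}_i$.

Proof. From (15), let $v_{ij}$, $j = 1, \dots, k_i$, span $T_i$ such that $(A - LC)v_{ij} = \sigma_{ij}v_{ij}$. From Lemma 4.1, let $\mathcal{K}_iv_{kj} = \kappa_{ij}v_{ij}$ if $k = i$ or $0$ if $k \neq i$. From Assumption 4.1, any element in the state space $\mathcal{X}$ can be represented by $\sum_{k=1}^{q}\sum_{j=1}^{k_k}\alpha_{kj}v_{kj}$. Then,

$$\mathcal{K}_i(A - LC)\Big(\sum_{k=1}^{q}\sum_{j=1}^{k_k}\alpha_{kj}v_{kj}\Big) = \sum_{j=1}^{k_i}\alpha_{ij}\sigma_{ij}\kappa_{ij}v_{ij}$$

$$(A - LC)\mathcal{K}_i\Big(\sum_{k=1}^{q}\sum_{j=1}^{k_k}\alpha_{kj}v_{kj}\Big) = \sum_{j=1}^{k_i}\alpha_{ij}\sigma_{ij}\kappa_{ij}v_{ij}$$

imply $\mathcal{K}_i(A - LC) = (A - LC)\mathcal{K}_i$.

From Lemma 4.2 and the algebraic form of (14),

$$\mathcal{K}_i = -[(A - LC) + (A - LC)^T]^{-1}H_i \tag{17}$$

where $(A - LC) + (A - LC)^T$ is invertible because $A - LC$ has to be stable. By substituting (17) into (13),

$$L^* = \Big(\sum_{i=1}^{q}H_i\Big)^{-1}\Big[\sum_{i=1}^{q}H_i(\gamma P_i + W_i)\Big]C^TV^{-1} \tag{18}$$

Lemmas 4.3 and 4.4 are important projector properties for Lemma 4.5 which will be used to simplify $L^*$ (18).

Lemma 4.3. There exists a state transformation $\Gamma$,

$$[\,T_1 \ \cdots \ T_q\,] = \Gamma\begin{bmatrix}M_1 & 0 & 0\\ 0 & \ddots & 0\\ 0 & 0 & M_q\end{bmatrix}$$

where $M_i$, $i = 1, \dots, q$, are any invertible $k_i \times k_i$ matrices, such that $\Gamma^TH_i\Gamma$ is block diagonal with a single nonzero block $\bar{H}_i$ in the $i$-th diagonal position.

Proof. Since $H_1$ (6) has a null space $\hat{T}_1$,

$$\operatorname{Ker}H_1 = [\,T_2 \ \cdots \ T_q\,] = \Gamma\begin{bmatrix}0\\ \hat{M}_1\end{bmatrix}$$

where $\hat{M}_1$ is a block diagonal matrix with diagonal matrix elements $M_2, \dots, M_q$, then

$$H_1\Gamma\begin{bmatrix}0\\ \hat{M}_1\end{bmatrix} = 0 \ \Rightarrow\ \Gamma^TH_1\Gamma\begin{bmatrix}0\\ \hat{M}_1\end{bmatrix} = 0$$

Since $\hat{M}_1$ is not zero by definition and $\Gamma^TH_1\Gamma$ is symmetric,

$$\Gamma^TH_1\Gamma = \begin{bmatrix}\bar{H}_1 & 0\\ 0 & 0\end{bmatrix}$$

Similarly, $i = 2, \dots, q$ can be proved.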

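Lemma 4.4.

$$H_i\Big(\sum_{k=1}^{q}H_k\Big)^{-1}H_j = \begin{cases}H_i, & i = j\\ 0, & i \neq j\end{cases}$$

Proof. For $i = 1$ and $j = 2$,

$$\Gamma^TH_1\Big(\sum_{k=1}^{q}H_k\Big)^{-1}H_2\Gamma = (\Gamma^TH_1\Gamma)\Big(\sum_{k=1}^{q}\Gamma^TH_k\Gamma\Big)^{-1}(\Gamma^TH_2\Gamma)$$

$$= \begin{bmatrix}\bar{H}_1 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0\end{bmatrix}\begin{bmatrix}\bar{H}_1 & 0 & 0\\ 0 & \ddots & 0\\ 0 & 0 & \bar{H}_q\end{bmatrix}^{-1}\begin{bmatrix}0 & 0 & 0\\ 0 & \bar{H}_2 & 0\\ 0 & 0 & 0\end{bmatrix} = 0$$

Therefore, $H_1(\sum_{k=1}^{q}H_k)^{-1}H_2 = 0$ and similarly it can be shown for all cases.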


Lemma 4.5. If there is no complementary subspace, $H_iW_i = 0$.

Proof. By multiplying the algebraic form of (10) by $H_i$ from left and right, substituting $L$ with (18) and using Lemma 4.4,

$$H_i(A_iW_i + W_iA_i^T - W_iC^TV^{-1}CW_i)H_i = 0$$

where $A_i = A - \gamma P_iC^TV^{-1}C$. Since $A_i$ is stable [11, 12], the solution is either $W_i = 0$ or $\operatorname{Im}W_i \subset \operatorname{Ker}H_i$. Therefore $H_iW_i = 0$.


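Theorem 4.6. In the limit, for $i = 1, \dots, q$,

$$(A - LC)T_i \subset T_i \tag{20}$$

where $L = (\sum_{i=1}^{q}H_i)^{-1}(\sum_{i=1}^{q}H_iP_i)C^T\bar{V}^{-1}$.

Proof. Instead of $P_i$ (9), using its inverse $\Pi_i = P_i^{-1}$ is a better way to discuss the limiting properties because $\Pi_i$ has a null space $\hat{T}_i$ in the limit [11, 12]. Then, the filter gain becomes

$$L = \Big(\sum_{i=1}^{q}H_i\Big)^{-1}\Big(\sum_{i=1}^{q}H_i\Pi_i^{-1}\Big)C^T\bar{V}^{-1} \tag{21}$$

where

$$0 = \Pi_iA + A^T\Pi_i + \Pi_i\Big(\frac{1}{\gamma}\hat{F}_i\hat{Q}_i\hat{F}_i^T - F_iQ_iF_i^T\Big)\Pi_i - C^T\bar{V}^{-1}C \tag{22}$$

Note that the infinite part of $\Pi_i^{-1}$ is annihilated by the projector $H_i$ because $\operatorname{Ker}H_i = \operatorname{Ker}\Pi_i$. For $i = 1$, multiply (20) by $\Pi_2$ from the left,

$$\Pi_2(A - LC)T_1 = 0$$

because $T_1$ is in the null space of $\Pi_2$ in the limit. By substituting $L$ with (21),

$$\Rightarrow\ \Pi_2AT_1 - \Pi_2\Big(\sum_{i=1}^{q}H_i\Big)^{-1}\Big(\sum_{i=1}^{q}H_i\Pi_i^{-1}\Big)C^T\bar{V}^{-1}CT_1 = 0$$

$$\Rightarrow\ \Pi_2AT_1 - \Pi_2\Big(\sum_{i=1}^{q}H_i\Big)^{-1}H_2\Pi_2^{-1}C^T\bar{V}^{-1}CT_1 = 0$$

because $\Pi_2(\sum_{i=1}^{q}H_i)^{-1}H_j = 0$ for $j \neq 2$, which can be shown similarly to Lemma 4.4.

$$\Rightarrow\ \Pi_2AT_1 - C^T\bar{V}^{-1}CT_1 = 0$$

which is true by multiplying (22) where $i = 2$ by $T_1$ from the right. Similarly, it can be shown

$$\Pi_i(A - LC)T_1 = 0$$

for $i = 3, \dots, q$. Since $\operatorname{Ker}\Pi_2 \cap \cdots \cap \operatorname{Ker}\Pi_q = T_1$ [11, 12],

$$(A - LC)T_1 \subset T_1$$

and for $i = 2, \dots, q$, it can be shown similarly.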


By using Lemma 4.5, $L^*$ (18) becomes

$$L^* = \Big(\sum_{i=1}^{q}H_i\Big)^{-1}\Big(\sum_{i=1}^{q}\gamma H_iP_i\Big)C^TV^{-1} \tag{19}$$

Theorem 4.6 shows that $L^*$ (19) is consistent with the assumption (15) in the limit and therefore the limiting optimal stochastic multiple-fault detection filter is equivalent to the Beard-Jones fault detection filter.

Remark 4. Lemma 4.5 implies that the projectors $H_i$, (6), minimize the cost (8). Therefore, (6) are the optimal projectors in the limit.

Remark 5. (19) shows a limiting property of the optimal stochastic multiple-fault detection filter. However, the optimal filter gain cannot be derived when $\gamma$ is zero because the filter gain depends on the inverse of $V$ which is zero. Therefore, only an approximate Beard-Jones fault detection filter can be derived when $\gamma$ is small. However, when a full-order Beard-Jones fault detection filter reduces to a few reduced-order filters, these reduced-order filters can be recovered by taking the optimal stochastic single-fault detection filter to the limit [12].


Remark  6.  By  combining  Lemma  4.5  and  HiPi{ti)Hi 
=  0  [11,  12],  the  optimal  stochastic  multiple-fault  de¬ 
tection  filter  satisfies  a  disturbance  attenuation  prob¬ 
lem, 

in the limit.


5  Example 


In this section, a numerical example from [4] shows that the minimization problem produces a fault detection filter when there is a complementary subspace. The system matrices are


$$A = \begin{bmatrix}0 & 3 & 4\\ 1 & 2 & 3\\ 0 & 2 & 5\end{bmatrix}$$


1 


The  power  spectral  densities  are  chosen  as  Qi  —  Q2  —  1 
V  =  10~®  I-  The  disturbance  attenuation  bound 
7  is  10“®.  The  infinite-time  minimization  problem  is 
solved  numerically  by  using  gradient  method  and  the 
frequency  response  of  the  filter  shows  that  the  two 
faults  are  isolated. 


6  Conclusion 

The optimal stochastic multiple-fault detection filter is a generalization of the single-fault filter. The residual space of the filter is divided into several subspaces and each subspace is sensitive to only its target fault, but not the nuisance faults, in the sense that the transmission from nuisance faults to the target residual space is small while the transmission from the target fault is large. In the limit as the nuisance fault weighting goes to infinity and in the absence of sensor noise and a complementary subspace, this filter is equivalent to a Beard-Jones fault detection filter which puts each fault into an unobservable subspace. This filter has the advantages of the unknown input observer in that it can be designed for robustness and the advantages of the Beard-Jones fault detection filter by being capable of detecting multiple faults in one filter. Although there is additional computation to determine the filter gain, this can be done off-line so that implementation is as straightforward as the Beard-Jones fault detection filter.

ABSTRACT

In this paper, we introduce the decentralized fault detection filter, a structure that results from merging decentralized estimation theory with the game theoretic fault detection filter. A decentralized approach may be the ideal way to monitor the health of large-scale systems, since it decomposes the problem into (potentially smaller) “local” problems and then blends the “local” results into a “global” result that describes the health of the entire system. The benefits of such an approach include added fault tolerance and easy scalability. An example given at the end of the paper demonstrates the use of this filter for a platoon of cars proposed for advanced vehicle control systems.


Introduction 

Observers play a central role in an important class of techniques for fault detection and identification (FDI). Since failures act as unexpected inputs, they will bias the error residuals of any observer designed about the nominal system. Moreover, because of their closed-loop nature, observers are able to maintain nonzero residuals for indefinite periods of time after the occurrence of a failure, and they possess reduced sensitivity to model mismatch, nonlinearities, and exogenous disturbances inherent to feedback systems.

There are two types of observers currently used for FDI purposes. The first is known as the Beard-Jones Fault Detection Filter (White and Speyer, 1987; Massoumnia, 1986; Douglas, 1993). This filter is a variation of the Luenberger Observer in which nonoverlapping invariant subspaces have been built around the reachable subspaces of the failures modelled in the system. The influence of any one of these failures is restricted to its own particular subspace, which allows for simultaneous detection and identification. That is, projecting the error residual onto each of these invariant subspaces one-by-one, a failure is detected when the projection is nonzero and identified by the subspace corresponding to the nonzero projection.

The second type of FDI observer is known as the unknown input observer. In this observer, the set of modelled faults is divided into two groups: the faults to be detected and the faults that are to be ignored. The former is made distinguishable from the latter by constructing an output through which the latter set is unobservable. Detection is then achieved when this output is nonzero and identification is trivial because we are only trying to detect one set in the possible presence of the other. The unknown input observer is clearly less capable than the Beard-Jones filter, but its relatively simple structure allows for easy approximation by optimization methods (Ding and Frank, 1989; Chung and Speyer, 1998).

*This work was sponsored by NASA-Ames Research Center Cooperative Agreement No. NCC 2-374, Supplement 19, Air Force Office of Scientific Research Grant F49620-97-1-0272, and California Department of Transportation Agreement No. 65H 978, MOU 126.
†A distinct advantage over open-loop FDI methods (White and Speyer, 1987).

As both of these approaches have become more refined, applications have begun to be seen in the literature for systems as varied as jet engines (Patton and Chen, 1992), missile guidance (Bowman and Speyer, 1987), nuclear reactors (Patton et al., 1991), and automated highways (Douglas et al., 1995, 1996). With the advent of applications, however, new issues related to implementation have come to the forefront. In this paper, we will look at some of the challenges inherent to detecting faults in large-scale systems. For such systems, a decentralized fault detection filter may be the logical approach to the problem.

The decentralized fault detection filter is the result of combining the game theoretic fault detection filter of Chung and Speyer (1998) with the decentralized filtering algorithm introduced by Speyer (1979) and extended by Willsky et al. (1982). It approximates the actions of an unknown input observer and is formed by combining the estimates of several “local” estimators (each driven by independent measurement sets). For large-scale systems, it simplifies the health monitoring problem by decomposing it into a collection of smaller problems. For other systems like a platoon of cars (Douglas et al., 1996; Wolfe et al., 1996) or a formation of airplanes, its decentralized structure reflects the actual physical structure of the system. Further, it introduces scalability for circumstances such as when a car joins the platoon or when an airplane drops out of formation for repairs. Finally, the decentralized fault detection filter has built-in fault tolerance in that sensors can be checked and validated prior to their measurements being blended into the global estimate (Kerr, 1985).

Decentralized  Estimation  Theory  and  Its  Application  to  FDI 
The  General  Solution 

In  this  section  we  will  review  the  basic  results  of  decentralized  estimation  theory.  A  detailed  examination  of  this  theory  is  given  in 
Chung  and  Speyer  (1995). 

Consider the following system driven by process disturbances $w$ and sensor noise $v$,

$$\dot{x} = Ax + Bw, \qquad x(0) = x_0, \quad x \in \mathbb{R}^n, \tag{1}$$

$$y = Cx + v, \qquad y \in \mathbb{R}^m. \tag{2}$$

It is desired to derive an estimate of $x$. The standard approach is a full-order observer,

$$\dot{\hat{x}} = A\hat{x} + L(y - C\hat{x}), \qquad \hat{x}(0) = 0, \tag{3}$$

which we will refer to as a centralized estimator. An alternative to this method is to derive the estimate with a decentralized estimator. In the decentralized approach, $\hat{x}$ is found by combining estimates based upon “local” models,

$$\dot{x}^j = A^jx^j + B^jw^j, \qquad x^j \in \mathbb{R}^{n_j}, \quad (j = 1 \dots N), \tag{4}$$

$$y^j = E^jx^j + v^j, \qquad (j = 1 \dots N). \tag{5}$$

Together these local models provide an alternate representation of the original system, which is referred to as the “global” system for purposes of clarification. The vector $x$ is likewise called the “global” state. The number of local systems $N$ is bounded above by the number of measurements in the system, i.e. $N \le m$.

The global/local decomposition is really of only secondary importance. As Chung and Speyer (1995) argue, there are no real restrictions on how one forms the global and local models. The real key to the decentralized estimation algorithm is the relationship between the global set of measurements $y$ and the $N$ local sets, $y^j$. The two basic assumptions are that the local sets are simply segments of the global set,

$$y = \begin{bmatrix}y^1\\ \vdots\\ y^N\end{bmatrix}, \tag{6}$$

and that the local sets can be described in terms of both the local state and the global state. In other words, $y^j$ can be given by (5) or by

$$y^j = C^jx + v^j, \qquad (j = 1 \dots N). \tag{7}$$
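The two measurement assumptions can be checked numerically on a toy case. The sketch below assumes a hypothetical four-state global system split into two local measurement sets; the matrices, the noise level and the random seed are illustrative choices only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical global system: 4 states, N = 2 local measurement sets.
C_local = [np.array([[1.0, 0.0, 0.0, 0.0]]),          # C^1 (1 measurement)
           np.array([[0.0, 1.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 1.0]])]          # C^2 (2 measurements)
x = rng.standard_normal(4)                             # global state
v_local = [0.01 * rng.standard_normal(Cj.shape[0]) for Cj in C_local]

# Local measurement sets written in terms of the global state, as in (7).
y_local = [Cj @ x + vj for Cj, vj in zip(C_local, v_local)]

# Stacking the local sets as in (6) gives the global measurement vector.
y = np.concatenate(y_local)
C = np.vstack(C_local)
v = np.concatenate(v_local)

print(np.allclose(y, C @ x + v))     # stacked local sets satisfy y = C x + v
print(len(C_local) <= C.shape[0])    # N is at most the number of measurements m
```

Stacking the local measurement matrices and noise vectors in the same order as the measurements is what makes the global and local descriptions agree.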


Equations 2, 6, and 7 imply that

$$C = \begin{bmatrix}C^1\\ \vdots\\ C^N\end{bmatrix}$$

and that

$$v = \begin{bmatrix}v^1\\ \vdots\\ v^N\end{bmatrix}.$$

The decentralized estimation algorithm falls out when we attempt to estimate the global state by first generating estimates of the local systems (4) using the local measurement sets $y^j$ and the local models $A^j$:

$$\dot{\hat{x}}^j = A^j\hat{x}^j + L^j(y^j - E^j\hat{x}^j), \qquad \hat{x}^j(0) = 0, \quad (j = 1 \dots N). \tag{9}$$


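The global state estimate, $\hat{x}$, is then found via

$$\hat{x} = \sum_{j=1}^{N}\big(G^j\hat{x}^j + h^j\big),$$

where $h^j$ is a measurement-dependent variable propagated by

$$\dot{h}^j = \Phi h^j + \big(\Phi G^j - \dot{G}^j - G^j\Phi^j\big)\hat{x}^j, \qquad h^j(0) = 0.$$

The constituent matrices are defined as

$$\Phi := A - \sum_{j=1}^{N}G^jL^jC^j, \qquad \Phi^j := A^j - L^jE^j.$$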
The $G^j$ matrices are “blending matrices”. Later, we will suggest a method for determining these matrices. In Chung and Speyer (1995), it was found that in order to get the same estimate using either the decentralized or standard centralized algorithms, the local and global gains had to be related via,

$$L = \begin{bmatrix}G^1 & \cdots & G^N\end{bmatrix}\begin{bmatrix}L^1 & 0 & \cdots & 0\\ 0 & L^2 & & \vdots\\ \vdots & & \ddots & 0\\ 0 & \cdots & 0 & L^N\end{bmatrix}. \tag{12}$$

In general, however, this condition cannot be met because of an insufficient number of equations required to solve for the unknowns. There is, however, one general class of estimator for which (12) is satisfied almost automatically. This class is comprised of estimators which take their gains from Riccati solutions, i.e. Kalman Filters (Speyer, 1979; Willsky et al., 1982) or $\mathcal{H}_\infty$ filters (Jang and Speyer, 1994). In this case, the local gains are found from

$$L^j = P^j(E^j)^T(V^j)^{-1}, \tag{13}$$

where, in the case of the Kalman Filter, the matrix $P^j$ is the solution of the Riccati Equation:

$$\dot{P}^j = A^jP^j + P^j(A^j)^T + B^jW^j(B^j)^T - P^j(E^j)^T(V^j)^{-1}E^jP^j, \qquad P^j(0) = P^j_0.$$

The matrices $V^j$ and $W^j$ are weightings which are taken to be the power spectral densities of the local disturbances, $v^j$ and $w^j$, which drive the local systems (4,5). For the Kalman Filter it is assumed that $v^j$ and $w^j$ are white, Gaussian signals. The initial condition $P^j_0$ is chosen by the analyst based upon his knowledge of the system. In the global system, the global gain is

$$L = PC^TV^{-1},$$

where

$$V = \begin{bmatrix}V^1 & 0 & \cdots & 0\\ 0 & V^2 & & \vdots\\ \vdots & & \ddots & 0\\ 0 & \cdots & 0 & V^N\end{bmatrix}$$

is restricted to a block diagonal form comprised of the local weightings $V^j$. The matrix $P$ is the solution to the global Riccati Equation,

$$\dot{P} = AP + PA^T + BWB^T - PC^TV^{-1}CP, \qquad P(0) = P_0.$$
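For a scalar local model, the steady-state version of the local Riccati equation can be solved in closed form, which gives a quick sanity check on the gain formula (13). The sketch below assumes the steady-state (algebraic) case rather than the time-varying equation in the text; the function name and the numerical values are hypothetical.

```python
import math

def scalar_kalman_gain(a, b, e, w, v):
    """Steady-state scalar case of (13): solve 0 = 2 a p + b^2 w - p^2 e^2 / v for p > 0."""
    if e == 0.0 or v <= 0.0:
        raise ValueError("need e != 0 and v > 0")
    # Positive root of the quadratic (e^2 / v) p^2 - 2 a p - b^2 w = 0.
    p = (a + math.sqrt(a * a + e * e * b * b * w / v)) * v / (e * e)
    gain = p * e / v
    return p, gain

# Hypothetical local model: x' = a x + b w, y = e x + v (values are illustrative).
a, b, e, w, v = 0.5, 1.0, 2.0, 1.0, 0.1
p, gain = scalar_kalman_gain(a, b, e, w, v)

riccati_residual = 2 * a * p + b * b * w - p * p * e * e / v
print(abs(riccati_residual) < 1e-9)   # p satisfies the algebraic Riccati equation
print(a - gain * e < 0)               # the local estimator is stable
```

Even though the open-loop local model in this example is unstable, the resulting gain stabilizes the local estimator, which is the property the Riccati-based construction relies on.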

The blending matrix solution is then,

$$G^j = P(S^j)^T(P^j)^{-1}, \qquad j = 1 \dots N, \tag{15}$$

where $S^j$ is any matrix such that

$$E^jS^j = C^j.$$

One can, in fact, always take $S^j = (E^j)^{\dagger}C^j$ where $(E^j)^{\dagger}$ is the pseudo-inverse of $E^j$ (Willsky et al., 1982). Note that the solutions for $G^j$ will always exist for Riccati-based observers so long as $P^j$ is invertible or, equivalently, positive-definite. This will always be the case if the local systems (4,5) are controllable and observable.

Implications  for  Detection  Filters 

The analysis of the previous section implies that we will be able to form a decentralized fault detection filter in the general case only if we are able to find a Riccati-based observer which is equivalent to a Beard-Jones Filter or unknown input observer. The most direct way to achieve this is to find a linear-quadratic optimization problem which is equivalent to the fault detection and identification problem. This is an analog of the famous inverse optimal control problem first posed by Kalman (1964). In Chung and Speyer (1998), however, it is shown that the Beard-Jones Filter gains do not correspond with those derived from linear-quadratic problems. An indirect way to get a Riccati-based observer is to pose a linear-quadratic optimization problem which closely mimics the fault detection problem. Such a problem was posed and solved in Chung and Speyer (1998), and we will review the solution found there in the next section.

The  Approximate  Fault  Detection  and  Identification  Problem 
Problem  Formulation 

Consider the system given by (1,2) with the further assumption that the state matrices have sufficient smoothness to guarantee the existence of derivatives of various orders. Beard (1971) showed that failures in the sensors and actuators, and unexpected changes in the plant dynamics, can be modeled as additive signals,

$$\dot{x} = Ax + Bw + F_1\mu_1 + \cdots + F_q\mu_q. \tag{16}$$

Let $n$ be the dimension of the state-space. The $n \times p_i$ matrix $F_i$, $i = 1 \cdots q$, is called a failure map and represents the directional characteristics of the $i$th fault. The $p_i \times 1$ vector $\mu_i$ is the failure signal and represents the time dependence of the failure. It will always be assumed that each $F_i$ is monic, i.e. $F_i\mu_i \neq 0$ for $\mu_i \neq 0$. See (Douglas, 1993; Chung and Speyer, 1998) for further details on how to model failures. Throughout this paper, we will refer to $\mu_1$ as the “target fault” and the other faults $\mu_j$, $j = 2 \cdots q$, as the “nuisance faults”. Without loss of generality, we can represent the entire set of nuisance faults (and, if desired, the disturbance $w$) with a single map $F_2$ and vector $\mu_2$:

$$\dot{x} = Ax + F_1\mu_1 + F_2\mu_2. \tag{17}$$


Suppose that it is desired to detect the occurrence of the failure, $\mu_1$, in spite of the measurement noise, $v$, and the possible presence of the nuisance faults, $\mu_2$. The Beard-Jones Filter solves this problem by picking the gain to a standard Luenberger Observer,

$$\dot{\hat{x}} = A\hat{x} + L(y - C\hat{x}), \tag{18}$$

so that the reachable subspaces of $\mu_1$ and $\mu_2$ are in separate and nonintersecting invariant subspaces. Thus, with a properly chosen projector $H$ we can project the filter residual, $(y - C\hat{x})$, onto the orthogonal complement of the invariant subspace containing $\mu_2$ and get a signal,

$$z = H(y - C\hat{x}), \tag{19}$$

such that

$$z = 0 \ \text{when} \ \mu_1 = 0 \ \text{and} \ \mu_2 \ \text{is arbitrary}. \tag{20}$$

To be useful for FDI, $z$ must also be such that

$$z \neq 0 \ \text{when} \ \mu_1 \neq 0. \tag{21}$$

If we restrict ourselves to time-invariant systems, (21) will be equivalent to requiring the transfer function matrix between $\mu_1(s)$ and $z(s)$ to be left-invertible, where $\mu_1(s)$ and $z(s)$ are the Laplace Transforms of the time-domain signals $\mu_1(t)$ and $z(t)$. Left-invertibility, however, is a severe restriction, and it has no analog for the general time-varying systems that we want to consider here. Previous researchers (Douglas, 1993; Massoumnia et al., 1989) have, in fact, only required that the mapping from $\mu_1(t)$ to $z(t)$ be input observable, i.e. $z \neq 0$ for any $\mu_1$ that is a step input. It is then argued (Massoumnia et al., 1989) that with input observability $z$ will be nonzero for "almost any" $\mu_1$, since $\mu_1$ is unlikely to remain in the kernel of the mapping to $z$ for all time.

We formulate the approximate detection filter design problem by requiring input observability and relaxing the requirement for strict blocking that is implied by (20). We, instead, only require that the transmission of the nuisance fault be bounded above by a pre-set level, $\gamma > 0$.


Equation  22  is  identical  to  the  disturbance  attenuation  problem  from  robust  control  theory.  We  refer  to  the  solution  to  the  approximate 

detection  filter  problem  as  the  game  theoretic  fault  detection  filter. 

We complete our formulation of the disturbance attenuation problem for fault detection by constructing the projector $H$ that determines the failure signal $z$. For time-invariant systems, this projector is constructed to map the reachable subspace of $\mu_2$ to zero (Beard, 1971; Douglas, 1993), i.e.

$$H = I - C\hat{F}\left[(C\hat{F})^T C\hat{F}\right]^{-1}(C\hat{F})^T, \tag{23}$$

where

$$\hat{F} = \left[\, A^{\beta_1} f_1, \ldots, A^{\beta_{p_2}} f_{p_2} \,\right]. \tag{24}$$

The vector $f_i$, $i = 1 \cdots p_2$, is the $i$th column of $F_2$, and the integer $\beta_i$ is the smallest natural number such that $CA^{\beta_i} f_i \neq 0$. The time-varying extension of this result is

$$H = I - C\hat{F}(t)\left[(C\hat{F}(t))^T C\hat{F}(t)\right]^{-1}(C\hat{F}(t))^T. \tag{25}$$

The columns of the matrix,

$$\hat{F}(t) = \left[\, b_1^{\beta_1}(t), \ldots, b_{p_2}^{\beta_{p_2}}(t) \,\right], \tag{26}$$

are constructed with the Goh Transformation (Chung and Speyer, 1998):

$$b_i^{1}(t) = f_i(t), \tag{27}$$

$$b_i^{j}(t) = A(t)\, b_i^{j-1}(t) - \dot{b}_i^{j-1}(t). \tag{28}$$

In the time-varying case, $\beta_i$ is the smallest integer for which the iteration above leads to a vector, $b_i^{\beta_i}(t)$, such that $C(t)\, b_i^{\beta_i}(t) \neq 0$ for all $t \in [t_0, t_1]$. It will be assumed that $A(t)$, $C(t)$, and $F_2(t)$ are such that $\beta_i$ exists. Since the state-space has dimension $n$, $\beta_i$ is such that $0 \le \beta_i \le n - 1$.
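For the time-invariant case, the construction described above can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it assumes the projector is the standard orthogonal projector that annihilates the columns $CA^{\beta_i} f_i$, and the function names `first_visible_power` and `nuisance_projector` are hypothetical.

```python
import numpy as np

def first_visible_power(A, C, f, tol=1e-9):
    """Return (beta, A^beta f) for the smallest beta >= 0 with C A^beta f != 0."""
    v = np.asarray(f, dtype=float)
    for beta in range(A.shape[0]):          # beta is at most n - 1
        if np.linalg.norm(C @ v) > tol:
            return beta, v
        v = A @ v
    raise ValueError("fault direction never becomes visible at the output")

def nuisance_projector(A, C, F2, tol=1e-9):
    """Assumed form: H = I - M M^+, with M = C [A^beta_1 f_1, ..., A^beta_p f_p]."""
    cols = [first_visible_power(A, C, F2[:, i], tol)[1] for i in range(F2.shape[1])]
    M = C @ np.column_stack(cols)
    # M @ pinv(M) is the orthogonal projector onto range(M), so H maps range(M) to zero.
    return np.eye(C.shape[0]) - M @ np.linalg.pinv(M)
```

Using the pseudo-inverse rather than an explicit inverse is a design choice of this sketch; it gives the same projector when $C\hat{F}$ has full column rank and degrades gracefully when it does not.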

We are now ready to discuss the conditions under which the solution to (22) will also generate an input observable mapping from $\mu_1$ to $z$. The key requirement is that the system be output separable. That is, $F_1$ and $F_2$ must be linearly independent and remain so when mapped to the output space by $C$ and $A$. For time-invariant systems, the test for output separability is

$$\mathrm{rank}\left[\, CA^{\delta_1}\tilde{f}_1, \ldots, CA^{\delta_{p_1}}\tilde{f}_{p_1},\ CA^{\beta_1} f_1, \ldots, CA^{\beta_{p_2}} f_{p_2} \,\right] = p_1 + p_2. \tag{29}$$

As in (24), $f_i$ is the $i$th column of $F_2$, and $\beta_i$ is the smallest integer such that $CA^{\beta_i} f_i \neq 0$. Similarly, $\tilde{f}_j$ is the $j$th column of $F_1$, and $\delta_j$ is the smallest integer such that $CA^{\delta_j}\tilde{f}_j \neq 0$. The integer sum, $p_1 + p_2$, is the total number of columns in $F_1$ and $F_2$. For time-varying systems, the output separability test becomes

$$\mathrm{rank}\left[\, C(t)\tilde{b}_1^{\delta_1}(t), \ldots, C(t)\tilde{b}_{p_1}^{\delta_{p_1}}(t),\ C(t) b_1^{\beta_1}(t), \ldots, C(t) b_{p_2}^{\beta_{p_2}}(t) \,\right] = p_1 + p_2, \quad \forall t \in [t_0, t_1], \tag{30}$$

where the vectors, $\tilde{b}_j^{\delta_j}$ and $b_i^{\beta_i}$, are found from the iteration defined by (27) and (28). The initial vector, $\tilde{b}_j^{1}$, is set equal to the $j$th column of $F_1$, and $b_i^{1}$ is initialized as the $i$th column of $F_2$.
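The time-invariant rank test (29) is straightforward to check numerically. The sketch below is only an illustration under the assumption that each fault direction is propagated through $A$ until it first appears at the output; the helper names `visible_direction` and `is_output_separable` are hypothetical.

```python
import numpy as np

def visible_direction(A, C, f, tol=1e-9):
    """Return C A^k f for the smallest k >= 0 at which it is nonzero."""
    v = np.asarray(f, dtype=float)
    for _ in range(A.shape[0]):
        out = C @ v
        if np.linalg.norm(out) > tol:
            return out
        v = A @ v
    raise ValueError("fault direction never becomes visible at the output")

def is_output_separable(A, C, F1, F2, tol=1e-9):
    """Rank test (29): the stacked output directions must have rank p1 + p2."""
    cols = [visible_direction(A, C, F[:, j], tol)
            for F in (F1, F2) for j in range(F.shape[1])]
    M = np.column_stack(cols)
    return bool(np.linalg.matrix_rank(M, tol=tol) == F1.shape[1] + F2.shape[1])
```

The tolerance `tol` is an assumption of the sketch; in practice it would need to be scaled to the magnitudes in the model.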

The following proposition, given in Chung and Speyer (1998), connects output separability to input observability and shows the importance of the monicity assumption:

Theorem 1 Suppose that a given filter satisfies (22) and generates the failure signal, $z$, given by (19). If $F_1$ and $F_2$ are output separable and $F_1$ is monic, then the mapping, $\mu_1(t) \mapsto z(t)$, is input observable.

A  Game  Theoretic  Solution 

We now turn our attention to the disturbance attenuation problem implied by (22). We begin by defining a disturbance attenuation function (Rhee and Speyer, 1991),

$$D_{af} = \frac{\int_{t_0}^{t_1} \|HC(x - \hat{x})\|_{Q}^{2}\, dt}{\int_{t_0}^{t_1} \left[ \|\mu_2\|_{M^{-1}}^{2} + \|v\|_{V^{-1}}^{2} \right] dt + \|x(t_0) - \hat{x}_0\|_{P_0^{-1}}^{2}}. \tag{31}$$

$D_{af}$ is simply a ratio of the outputs over the disturbances. Equation 31 is patterned roughly after (22). We have added the sensor noise, $v$, and the initial error, $x(t_0) - \hat{x}_0$, to the set of disturbance signals to inject tradeoffs for noise rejection and settling time into the problem. $M$, $V$, $Q$, and $P_0$ are weighting matrices. We are now focusing on nuisance blocking. Our only concern with $\mu_1$ is that it be visible at the output, which is what Proposition 1 guarantees.

The disturbance attenuation problem is to find the estimate $\hat{x}$ so that for all $\mu_2, v \in L_2[t_0, t_1]$, and $x(t_0) \in \mathbb{R}^n$,

$$D_{af} \le \gamma.$$

The positive real number $\gamma$ is called the disturbance attenuation bound. $(C, A)$ will always be assumed to be an observable pair. To solve this problem, we convert (31) into a cost function,

$$J = \int_{t_0}^{t_1} \left[ \|HC(x - \hat{x})\|_{Q}^{2} - \gamma\left( \|\mu_2\|_{M^{-1}}^{2} + \|y - Cx\|_{V^{-1}}^{2} \right) \right] dt - \|x(t_0) - \hat{x}_0\|_{\Pi_0}^{2},$$

where we have used (2) to rewrite the measurement noise term. Note that we have also rewritten the initial error weighting, defining $\Pi_0 = \gamma P_0^{-1}$. The disturbance attenuation problem is then solved via the differential game,

$$\min_{\hat{x}} \max_{y} \max_{\mu_2} \max_{x(t_0)} J \le 0,$$

subject to

$$\dot{x} = Ax + F_2\mu_2,$$

$$y = Cx + v.$$


The solution to this problem (Chung and Speyer, 1998) turns out to be a Luenberger Observer,

$$\dot{\hat{x}} = A\hat{x} + \gamma\,\Pi^{-1} C^T V^{-1}(y - C\hat{x}), \qquad \hat{x}(t_0) = \hat{x}_0, \tag{35}$$

whose gain is taken from the solution to a Riccati Equation,

$$-\dot{\Pi} = A^T\Pi + \Pi A + \frac{1}{\gamma}\,\Pi F_2 M F_2^T \Pi + C^T(HQH - \gamma V^{-1})C, \qquad \Pi(t_0) = \Pi_0. \tag{36}$$

In many cases, it is desired to extend finite-time solutions of game theoretic problems to the steady-state condition. Whenever it is possible to find such a solution, the optimal estimator will be given by (35) with $\Pi$ being the solution of the algebraic Riccati Equation,

$$0 = A^T\Pi + \Pi A + \frac{1}{\gamma}\,\Pi F_2 M F_2^T \Pi + C^T(HQH - \gamma V^{-1})C. \tag{37}$$

However, unlike linear quadratic optimal control problems, there are no conditions which guarantee the existence of a unique, nonnegative definite, stabilizing solution to the steady-state Riccati Equation, except in the special case where $A$ is asymptotically stable (Green and Limebeer, 1995).
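Because existence of a suitable steady-state solution is not guaranteed, it can help to verify a candidate $\Pi$ directly. The sketch below, offered only as an illustration, evaluates the residual of the algebraic Riccati equation (37) and forms the observer gain $\gamma\Pi^{-1}C^TV^{-1}$ from (35). The function names `are_residual` and `filter_gain` are hypothetical, and how the candidate $\Pi$ is obtained is left open.

```python
import numpy as np

def are_residual(Pi, A, C, F2, H, Q, V, M, gamma):
    """Right-hand side of (37); a valid steady-state solution drives this to zero."""
    Vinv = np.linalg.inv(V)
    return (A.T @ Pi + Pi @ A
            + (Pi @ F2 @ M @ F2.T @ Pi) / gamma
            + C.T @ (H @ Q @ H - gamma * Vinv) @ C)

def filter_gain(Pi, C, V, gamma):
    """Observer gain gamma * Pi^{-1} C^T V^{-1} from (35)."""
    return gamma * np.linalg.solve(Pi, C.T @ np.linalg.inv(V))
```

For a scalar system with $A = -1$, $C = F_2 = H = Q = V = M = 1$ and $\gamma = 0.5$, equation (37) reduces to $2\Pi^2 - 2\Pi + 0.5 = 0$, so $\Pi = 0.5$ and the gain is 1. After computing a gain, one would also check that $A - LC$ is stable, since the residual alone does not establish that.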

The  Decentralized  Fault  Detection  Filter 

Given  the  results  of  the  previous  two  sections,  we  now  propose  a  decentralized  fault  detection  filtering  algorithm.  The  essential  idea 
is  to  implement  the  Riccati-based  game  theoretic  fault  detection  filter  as  a  decentralized  estimator.  An  overview  of  the  procedure  is  as 

follows: 

1.  Identify  the  sensors  and  actuators  which  must  be  monitored  at  the  global  level,  i.e.  define  the  target  faults  for  the  global  filter. 

2.  Identify  the  faults  which  should  be  included  in  the  global  nuisance  set.  The  remaining  faults  should  be  monitored  at  the  local  levels. 

3. Derive global and local models for the system including failure maps. Chung and Speyer (1998) contains a brief discussion about this
process. We will demonstrate one method in which the local models are derived from the global model via a minimum realization.

4.  Design  game  theoretic  fault  detection  filters  for  the  local  and  global  systems.  Solve  the  corresponding  Riccati  equations  and  store 
the  solutions  for  later  use. 

5.  Determine  the  blending  solutions  from  Equation  15. 

6. Propagate the local estimates and vectors and then use the decentralized estimation algorithm (10) to derive a global estimate, $\hat{x}$.

7. Determine the global failure signal from $(y - C\hat{x})$ where $y$ is the total measurement set, $C$ is the global measurement matrix, and $\hat{x}$ is the global fault detection filter estimate just derived.


We will now apply these steps in an example.

Figure 1. Two-Car Platoon with Range Sensor (Car #1 and Car #2, with the range between them marked)


Range  Sensor  Fault  Detection  In  a  Platoon  of  Cars 
Problem  Statement 

We  will  now  examine  the  utility  of  the  decentralized  approach  to  FDI  by  working  through  an  example.  The  problem  that  we  will 
look  at  involves  the  detection  of  failures  within  a  system  of  two  cars  traveling  as  a  platoon  (See  Figure  1).  The  cars  are  controlled  to 
maintain  a  uniform  speed  and  constant  separation.  The  platoon  is  the  central  component  of  automated  highway  schemes  in  which  groups 
of  cars  line  up  single  file  and  travel  as  a  unit.  The  objective  is  to  eliminate  the  backup  caused  by  the  interaction  of  individual  vehicles 
maneuvering across highway lanes (Douglas et al., 1995, 1996). The viability of the platooning scheme, however, will depend on many
factors,  not  the  least  of  which  are  reliability  and  safety. 

The  FDI  schemes  that  we  have  examined  to  this  point  are  capable  of  monitoring  individual  cars,  but  may  not  be  ideal  for  monitoring 
elements that deal with the interactions between cars. For example, to maintain uniform speed throughout the platoon and to keep the
spacing between the cars constant, additional sensors will be needed to measure the relative speed and the relative distance, or "range",
between  the  cars.  In  order  to  detect  a  failure  in  the  range  sensor  using  analytical  redundancy,  however,  it  is  necessary  to  have  a  dynamical 
relationship  between  the  range  sensor  and  other  sensors  on  the  vehicles.  Range,  however,  involves  the  dynamics  of  both  of  the  cars  and 
so  would  require  a  higher-order  model  for  its  detection  filter. 

While  this  is  not  necessarily  prohibitive,  it  does  not  make  use  of  the  many  different  state  estimates  that  are  already  being  propagated 
throughout  the  platoon.  The  sensors  on  each  of  the  cars,  for  instance,  will  be  monitored  by  detection  filters,  and  it  is  more  than  likely 
that  a  state  estimate  would  also  be  generated  by  the  vehicles’  control  loops.  Given  these  pre-existing  estimates,  it  seems  logical  to  make 
use of the decentralized estimation algorithm to carry out range sensor fault detection.

System Dynamics and Failure Modeling

Our example starts with the car model used in Douglas et al. (1995). In this model, the nonlinear, six degree-of-freedom dynamics of a representative automobile are linearized about a straight, level path at a speed of 25 meters/sec (roughly 56 miles per hour). The linearized equations are found to decouple nicely into latitudinal and longitudinal dynamics, much like an airplane. Moreover, the linearized equations can be further reduced by eliminating "fast modes" and actuator states. For simplicity, we will only use the longitudinal dynamics which we represent as

$$\dot{x} = A^{L} x,$$

$$y = C^{L} x,$$


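where the superscript "L" stands for "longitudinal." The vehicle states are

$$x = \begin{bmatrix} m_a \\ \omega_e \\ v_x \\ v_z \\ z \\ q \\ \theta \end{bmatrix} \quad \begin{array}{l} \text{engine air mass (kg)} \\ \text{engine speed (rad/sec)} \\ \text{long. velocity (m/sec)} \\ \text{vertical velocity (m/sec)} \\ \text{vertical position (m)} \\ \text{pitch rate (rad/sec)} \\ \text{pitch (rad)} \end{array} \tag{38}$$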


and are propagated by the state matrix*

$$A^{L} = \begin{bmatrix}
-0.087694 & 0.0038094 & -0.12133 & -0.010701 & 3.9941 & 42.617 & 1.2879 \\
0.032194 & -1.6765 & 57.123 & 7.2346 & 26.27 & -665.78 & 496.6 \\
4.6169\text{e-}05 & -0.021736 & -22.56 & 0.11478 & -0.00095051 & 7.7651\text{e-}05 & -4.5754\text{e-}05 \\
-0.075512 & 7.7689 & -301.66 & -38.647 & -137.16 & 3612 & -2816.7 \\
-0.096212 & -0.073026 & 2.498 & 0.2312 & 0.89067 & -19.054 & 9.0737 \\
-0.94943 & -0.26102 & -0.20407 & -0.067025 & -0.41229 & -2.4689 & 0.16425 \\
-0.27186 & 0.92418 & 0.12024 & 0.19024 & -0.010912 & -1.302 & -1.434
\end{bmatrix} \tag{39}$$


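The measurements are

$$y = \begin{bmatrix} m_a \\ \omega_e \\ \dot{v}_x \\ \dot{v}_z \\ q \\ \omega_f \\ \omega_r \end{bmatrix} \quad \begin{array}{l} \text{engine air mass (kg)} \\ \text{engine speed (rad/sec)} \\ \text{long. acceleration (m/sec}^2\text{)} \\ \text{heave acceleration (m/sec}^2\text{)} \\ \text{pitch rate (rad/sec)} \\ \text{front symmetric wheel speed (rad/sec)} \\ \text{rear symmetric wheel speed (rad/sec)} \end{array} \tag{40}$$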

with the corresponding measurement matrix,

$$C^{L} = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.0713 & -0.8177 & 0.5934 & 6.7786 & 16.8068 & 1.5162 \\
0 & -0.0020 & 0.0221 & -3.5646 & -40.4210 & -9.0765 & -0.8141 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 7.1220 & -4.5806 & -51.9152 & 58.8718 & 5.1944 \\
0 & 0.0888 & 5.9738 & -3.5782 & -40.5542 & -56.4109 & -4.9773
\end{bmatrix} \tag{41}$$

Copyright © 1999 by ASME

* The rear and front symmetric wheel speeds are not states, since they were eliminated when the fast modes were factored out of the linearized system.

In order to build a detection filter for the range sensor, we need to use (38-41) to build state space models for the platoon,


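$$\dot{\eta} = A\eta + F_1\mu_1 + F_2\mu_2,$$

$$y = C\eta,$$

and the two individual cars,

$$\dot{\eta}^{1} = A^{1}\eta^{1} + F_1^{1}\mu_1^{1} + F_2^{1}\mu_2^{1}, \qquad y^{1} = E^{1}\eta^{1},$$

$$\dot{\eta}^{2} = A^{2}\eta^{2} + F_1^{2}\mu_1^{2} + F_2^{2}\mu_2^{2}, \qquad y^{2} = E^{2}\eta^{2}.$$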


We will build up our models with the following steps:

1. Using (38-41), we will derive the global state matrices, $A$ and $C$.

2. Using the modelling techniques described in Douglas (1993) and Chung and Speyer (1998), we will determine the failure maps, $F_i$.

3. We will then obtain the local state matrices, $A^i$, $E^i$, and $F^i$, from the minimum realization of the triples $(C^1, A, F^1)$ and $(C^2, A, F^2)$.


Our  general  strategy  is  to  derive  the  global  equation  first  and  then  get  the  local  equations  from  decompositions  based  upon  observability 
and  controllability.  While  this  is  by  no  means  the  only  way  to  obtain  the  global  and  local  representations  of  a  system,  it  is  a  logical 

method  that  can  be  applied  to  any  problem. 
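To make the decomposition idea concrete, the following sketch extracts the observable part of a pair $(A, C)$, which is the observability half of a minimum realization. It is only an illustration under simplifying assumptions: the controllability reduction with respect to the failure maps is omitted, the rank tolerance is arbitrary, and the name `observable_part` is hypothetical.

```python
import numpy as np

def observable_part(A, C, tol=1e-9):
    """Restrict (A, C) to the observable subspace using an orthonormal basis T."""
    n = A.shape[0]
    blocks, row = [], C
    for _ in range(n):                 # observability matrix [C; CA; ...; CA^(n-1)]
        blocks.append(row)
        row = row @ A
    O = np.vstack(blocks)
    _, s, Vt = np.linalg.svd(O)
    r = int(np.sum(s > tol * max(s[0], 1.0)))
    T = Vt[:r].T                       # columns span the row space of O
    # The null space of O is A-invariant, so T.T @ A @ T carries the observable dynamics.
    return T.T @ A @ T, C @ T, T
```

Applied to the platoon, the local measurement matrix of one car would play the role of `C` and the global state matrix the role of `A`; states that the local measurements cannot see drop out of the reduced model.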

The obvious way to get the global matrices, $A$ and $C$, is to form block diagonal composite matrices with $A^L$ and $C^L$ repeated on the diagonal, i.e.

$$A = \begin{bmatrix} A^{L} & 0 \\ 0 & A^{L} \end{bmatrix}, \qquad C = \begin{bmatrix} C^{L} & 0 \\ 0 & C^{L} \end{bmatrix}.$$

This, however, is not sufficient, since there is no way to describe the range, $R$, between the two vehicles with the given states, (38). Range is the relative distance between the cars, i.e. the difference between the longitudinal displacements of the two cars. Displacement, however, is not a state of the vehicle (38). We must, therefore, add a range state to the platoon dynamics, using the equation,

$$\dot{R} = v_x^{1} - v_x^{2}.$$

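The end result is that the platoon will be a fifteen-state system,

$$\eta = \left[\, m_a^1,\ \omega_e^1,\ v_x^1,\ v_z^1,\ z^1,\ q^1,\ \theta^1,\ m_a^2,\ \omega_e^2,\ v_x^2,\ v_z^2,\ z^2,\ q^2,\ \theta^2,\ R \,\right]^T,$$

whose elements are:

$m_a^1$: engine air mass (kg) - Car #1
$\omega_e^1$: engine speed (rad/sec) - Car #1
$v_x^1$: long. velocity (m/sec) - Car #1
$v_z^1$: vertical velocity (m/sec) - Car #1
$z^1$: vertical position (m) - Car #1
$q^1$: pitch rate (rad/sec) - Car #1
$\theta^1$: pitch (rad) - Car #1
$m_a^2$: engine air mass (kg) - Car #2
$\omega_e^2$: engine speed (rad/sec) - Car #2
$v_x^2$: long. velocity (m/sec) - Car #2
$v_z^2$: vertical velocity (m/sec) - Car #2
$z^2$: vertical position (m) - Car #2
$q^2$: pitch rate (rad/sec) - Car #2
$\theta^2$: pitch (rad) - Car #2
$R$: range (m)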


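The corresponding state matrix is

$$A = \begin{bmatrix} A^{L} & 0 & 0 \\ 0 & A^{L} & 0 \\ E_1 & -E_1 & 0 \end{bmatrix}, \qquad E_1 = \begin{bmatrix} 0 & 0 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}. \tag{42}$$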
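The block structure of the platoon state matrix can be assembled mechanically from the single-car matrix. The sketch below is a minimal illustration, assuming the longitudinal velocity is the third car state as in (38); the function name `platoon_state_matrix` is hypothetical.

```python
import numpy as np

def platoon_state_matrix(AL, velocity_index=2):
    """Stack two copies of the car dynamics and append the range-rate row."""
    n = AL.shape[0]
    E1 = np.zeros((1, n))
    E1[0, velocity_index] = 1.0            # picks out the longitudinal velocity
    Z = np.zeros((n, n))
    z = np.zeros((n, 1))                   # range does not feed back into either car
    return np.block([
        [AL, Z,   z],
        [Z,  AL,  z],
        [E1, -E1, np.zeros((1, 1))],       # range rate = v_x (car 1) - v_x (car 2)
    ])
```

With the 7-state car model this yields the 15-by-15 matrix described in the text, with the last row encoding the range-rate equation.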
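The measurement matrix is

$$C = \begin{bmatrix} C^{L} & 0 & 0 \\ 0 & C^{L} & 0 \\ 0 & 0 & 1 \end{bmatrix}, \tag{43}$$

where $C^1$ and $C^2$ can be inferred from (43). Finally, the local measurement sets are

$$y^{1} = \left[\, m_a^1,\ \omega_e^1,\ \dot{v}_x^1,\ \dot{v}_z^1,\ q^1,\ \omega_f^1,\ \omega_r^1 \,\right]^T,$$

with elements:

$m_a^1$: engine air mass (kg) - Car #1
$\omega_e^1$: engine speed (rad/sec) - Car #1
$\dot{v}_x^1$: long. acceleration (m/sec^2) - Car #1
$\dot{v}_z^1$: heave acceleration (m/sec^2) - Car #1
$q^1$: pitch rate (rad/sec) - Car #1
$\omega_f^1$: front symmetric wheel speed (rad/sec) - Car #1
$\omega_r^1$: rear symmetric wheel speed (rad/sec) - Car #1

and

$$y^{2} = \left[\, m_a^2,\ \omega_e^2,\ \dot{v}_x^2,\ \dot{v}_z^2,\ q^2,\ \omega_f^2,\ \omega_r^2,\ R \,\right]^T,$$

with elements:

$m_a^2$: engine air mass (kg) - Car #2
$\omega_e^2$: engine speed (rad/sec) - Car #2
$\dot{v}_x^2$: long. acceleration (m/sec^2) - Car #2
$\dot{v}_z^2$: heave acceleration (m/sec^2) - Car #2
$q^2$: pitch rate (rad/sec) - Car #2
$\omega_f^2$: front symmetric wheel speed (rad/sec) - Car #2
$\omega_r^2$: rear symmetric wheel speed (rad/sec) - Car #2
$R$: range (m)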
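The global measurement matrix and its split into the two local measurement sets can be sketched in the same way. This is an illustration only: it assumes that Car #1 contributes its seven on-board measurements and that Car #2 contributes its seven measurements plus the range, as the lists above indicate. The function name `platoon_measurement_matrices` is hypothetical.

```python
import numpy as np

def platoon_measurement_matrices(CL):
    """Return (C, C1, C2): global matrix plus the rows seen by Car #1 and Car #2."""
    m, n = CL.shape
    # Car #1 measures only its own states.
    C1 = np.hstack([CL, np.zeros((m, n)), np.zeros((m, 1))])
    # Car #2 measures its own states and, in the last row, the range state.
    C2 = np.vstack([
        np.hstack([np.zeros((m, n)), CL, np.zeros((m, 1))]),
        np.hstack([np.zeros((1, 2 * n)), np.ones((1, 1))]),
    ])
    return np.vstack([C1, C2]), C1, C2
```

For the 7-by-7 single-car measurement matrix this gives a 15-by-15 global matrix whose first seven rows belong to Car #1 and whose remaining eight rows belong to Car #2.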


Our  ultimate  objective  is  to  design  a  filter  which  will  detect  a  range  sensor  fault  in  the  presence  of  potential  failures  in  the  other 
sensors.  In  an  actual  health  monitoring  system,  we  would  design  the  global  filter  to  block  out  all  of  the  nuisance  faults  that  are  output 
separable from the range sensor fault and then rely upon the local filters to monitor the remaining faults. Given the size of our example, however, the full analysis required to do a detailed design would clutter our presentation. We will, therefore, limit ourselves to
constructing  only  one  local  filter  and  will  choose  simple  nuisance  sets  at  both  the  global  and  local  levels. 

For  this  example,  we  choose  to  monitor  the  front  symmetric  wheel  speed  sensor  at  the  local  level.  The  nuisance  set  is  then  chosen  to 
be  the  engine  air  mass  sensor  and  the  heave  accelerometer.  At  the  global  level,  the  range  sensor  has  already  been  designated  as  the  target 
fault.  We,  therefore,  complete  the  problem  definition  by  choosing  the  engine  speed  sensor  and  longitudinal  accelerometer  as  the  global 
nuisance  set.  There  is  no  particular  significance  attached  to  any  of  our  choices  for  the  nuisance  and  target  sets,  aside  from  the  choice  of 
the range sensor as the global target fault.

Following standard modelling techniques (Douglas, 1993; Chung and Speyer, 1998), we construct the two engine speed sensor failure maps, $F_{\omega_e^1}$ and $F_{\omega_e^2}$. To save space we do not list these matrices out explicitly. The interested reader can refer to (Chung, 1997). To complete the problem we also need to construct maps for the accelerometer failures, $F_{\dot{v}_x^1}$ and $F_{\dot{v}_x^2}$, and the range sensor, $F_R$. For the local filters, failure maps need to be constructed for the airmass sensors, $F_{m_a^1}$ and $F_{m_a^2}$, heave accelerometers, $F_{\dot{v}_z^1}$ and $F_{\dot{v}_z^2}$, and front wheel speed sensors, $F_{\omega_f^1}$ and $F_{\omega_f^2}$. A quick application of (29) will show that all of our failure sets are output separable.

We are now in position to generate the local state equations. The local dynamics for car #1 come from the minimum realization of $(C^1, A, [\,F_{m_a^1}\ \ F_{\dot{v}_z^1}\,])$. The corresponding matrices are


$$A^{1} = \begin{bmatrix}
-0.087694 & 0.0038094 & -0.12133 & -0.010701 & 3.9941 & 42.617 & 1.2879 \\
0.032194 & 1.6765 & 57.123 & 7.2346 & 26.27 & -665.78 & 496.6 \\
4.6169\text{e-}05 & -0.021736 & -22.56 & 0.11478 & -0.00095051 & 7.7651\text{e-}05 & -4.5754\text{e-}05 \\
-0.075512 & 7.7689 & -301.66 & -38.647 & -137.16 & 3612 & -2816.7 \\
-0.096212 & -0.073026 & 2.498 & 0.2312 & 0.89067 & -19.054 & 9.0737 \\
-0.94943 & -0.26102 & -0.20407 & -0.067025 & -0.41229 & -2.4689 & 0.16425 \\
-0.27186 & 0.92418 & 0.12024 & 0.19024 & -0.010912 & -1.302 & -1.434
\end{bmatrix}$$

E'  = 


0  0  10  0 
-0.00039519  0.18605  0  -0.98251  0.008136 

0.0043561  -0.014182  0  -0.090334  -0.2118 

0.00015951  -0.00067636  0  -0.0048006  -4.0642 
-0.00014266  -0.97872  0  -0.18537  0.0016064 

-0.00030256  0.0016942  0  0.0069288  1.4478 

0.0009564  -0.0038718  0  -0.019192  2.1041 


0 

0 

0.00066466  0.00039164 

11.266 

-14.31 

-41.318 

-2.4264 

0.024547 

-0.084511 

-34.102 

-71.377 

-55.207 

42.987 

0  -0.12133  ■ 

7.9031  -1.6879' 

0  57.1230 

-0.0007  -0.0213 

1  -22.5605 

0  0 

0  -301.6586 

-0.0048  -0.0057 

0  2.4980 

-0.1760  -0.7911 

0  -0.2041 

-0.0068  -7.4136 

0  0.12024 

-0.0003  -2.1388 

The model for Car #2 is similarly found by obtaining the minimum realization of $(C^2, A, [\,F_{m_a^2}\ \ F_{\dot{v}_z^2}\,])$. The corresponding matrices are


-0.26387  -0.27372  0.97419  -0.040683 

0 

0 

0 

0 

0.28256  0.2607  0.042752  1.0237 

0 

0 

0 

0 

-12.546  -12.054  -1.4539  -0.79488 

-0.0025104 

0.00016431  0.00013564 

0.034164 

-28.279  -27.514  -2.1059  -3.0468 

0.0048048 

8.1292 

6,7111 

-0.065389 

195.07  193.92  -2.3745  38.898 

-0.19848 

-152.87 

-126.21 

2.7044 

3.8593  4.598  0.3571  0.51933 

-4.1413e-06 

-21.332 

-18.419 

5.6359e-05 

-4.0915  -4.8456  -0.37617  -0.54711 

3.0926e-05 

22.827 

17.824 

-0.00042087 

-2654.8  -2639.1  32.315  -529.37 

-304.44 

2080.5 

1717.5 

-57.774 

0 

0 

-12.008 

5.9034 

-0.011223 

-43.291 

40.011 

0.69369 

u 


0  0 

0  0 

-11.76  -0.54089 

6.9535  0.53293 

0.011321  -0.73157 
-39.922  -8.6999 
39.775  -0.48704 

-0.71973  -0.014645 


0  0.99731 

0  0.073283 

-1.6668  0.0052249 

0.79148  -0.00014384 

-0.68157  0 

1.9601  0 

7.9783  0.0065064 

-0.0075736  0 


0  0 

0  0 

5.2402  4.3261 

-31.362  -25.727 

-0.0019527  -0.0016121 
-40.162  -33.156 

-31.356  -25.886 

-0.017553  -0.014491 


0.073283  ■ 
-0.99731 
-0.071106 
0.0019575 
0 
0 

-0.088546 

0 
0  0

0  0 

0  0 

0  0 

0.9973  0.0002 

0  0 

0  0 

0.0733  -307.8575 


fI 

v: 


0  0 

0  0 

0  0 

0  0 

0  0 

-5.0327  -4.9282 
6.0961  -6.2254 
0  0 


With all of these system matrices in place, we can now form the residual projectors H needed to generate the failure signal, z. In the
global  filter,  we  define 


In  the  local  filters,  we  define 

$$F^{i} = \left[\, F_{m_a^i}\ \ F_{\dot{v}_z^i} \,\right], \qquad i = 1, 2.$$

The projectors $H$ and $H^i$ are then found by applying (23). Again, we do not show either of these matrices explicitly to save space.

Decentralized  Fault  Detection  Filter  Design 

We  will  first  design  filters  for  the  local  systems.  As  with  all  Riccati-based  filters,  the  central  step  in  the  process  is  in  obtaining  a 
solution to the appropriate Riccati Equation. For simplicity, we will use the steady-state version. Typically, one iterates on the design
by  trying  various  combinations  of  weightings  until  a  Riccati  solution  is  found  which  leads  to  a  filter  that  gives  the  best  tradeoff  between 
target  fault  transmission  and  nuisance  fault  attenuation.  For  this  example,  it  was  found  that 

$$M^{1} = 10 \times I_7, \qquad V^{1} = \mathrm{diag}[\,1\ 1\ 10\ 1\ 1\ 1\ 1\,], \qquad Q^{1} = I_7, \qquad \gamma = 0.18$$

leads to the filter for Car #1 depicted in Figure 2. The minimum separation over frequency is only 35 dB, but the filter has particularly good separation in the low frequency range. For Car #2, the same weightings, adjusted for the different dimensions of the Car #2 dynamics,

$$M^{2} = 10 \times I_8, \qquad V^{2} = \mathrm{diag}[\,1\ 1\ 10\ 1\ 1\ 1\ 1\ 1\,], \qquad Q^{2} = I_8, \qquad \gamma = 0.18,$$

lead to a filter with the performance depicted in Figure 3.

Figure 2. Platoon Example - Signal Transmission in the Local Detection Filter on Car #1 (singular value plot of local game theoretic filter #1; accelerometer fault transmission shown with solid line, nuisance fault transmission shown with dashed line)

Figure 3. Platoon Example - Signal Transmission in the Local Detection Filter on Car #2 (accelerometer fault transmission shown with solid line, nuisance fault transmission shown with dashed line)

Finally, for the global system, a fault detection filter for range sensor health monitoring in the platoon is found by solving the corresponding Riccati Equation with the weightings:

$$V = I_{15}, \qquad Q = I_{15}, \qquad M = 100 \times I_8, \qquad \gamma = 0.18.$$

The resulting filter has the properties depicted in Figure 4. The decentralized implementation that we proposed in the previous section should also exhibit this level of performance. As a check, a simple time domain simulation was run comparing the response of the residual signal when the system is driven by the target fault (a step failure of the range sensor) to when it is driven by a nuisance fault (a step failure of the longitudinal accelerometer on Car #1). Because we are using Riccati-based estimators, the blending matrices are given by (15). The connecting matrices $S^j$ are taken to be the pseudo-inverses of $E^j$. As Figure 5 shows, the resulting decentralized fault detection filter does a good job of distinguishing the target fault from the nuisance fault.

Remark 2 It must be noted that we have assumed that the lead car will transmit its measurements $y^1$, its local state estimates, and the accompanying local vector back to car #2 so that the latter can form the global estimate via the decentralized estimation algorithm. Transmission issues and limitations, quite obviously, open up the potential for new problems. We have also assumed that each car will have stored on-board the needed Riccati solutions for all likely scenarios.


Conclusions 

In this paper, we have introduced a decentralized fault detection filter which provides an alternative way to monitor large-scale
systems for faults. The resulting filter has additional fault tolerance because it can check the health of its constituent sensors prior to
deriving the top level estimate, and it is easily scalable for problems which are varying in size such as collections of systems.